BATTELLE 
RENCONTRES 


1967 LECTURES IN 
MATHEMATICS AND PHYSICS 


EDITED BY 


Cecile M. DeWitt 


University of North Carolina 
at Chapel Hill 


AND


John A. Wheeler 


Princeton University 




W. A. BENJAMIN, INC. New York · Amsterdam · 1968


BATTELLE RENCONTRES 
1967 Lectures in Mathematics and Physics 


Copyright © 1968 by W. A. Benjamin, Inc. 

All rights reserved 

Library of Congress Catalog Card Number 68-24362 
Manufactured in the United States of America 
12345M321098 


W. A. BENJAMIN, INC. New York, New York 10016 


List of Participants 


RAOUL H. BOTT 
Department of Mathematics, Harvard University, Cambridge, Massachusetts
02138
JOHN B. BOYLING 
Department of Applied Mathematics and Theoretical Physics, Silver Street, 
Cambridge, England 
BRANDON CARTER 
Department of Applied Mathematics and Theoretical Physics, Silver Street, 
Cambridge, England 
YVONNE CHOQUET-BRUHAT 
Département de Mathématiques, Université de Paris, 11 rue Pierre Curie,
Paris 5, France 
BRYCE S. DEWITT 
Department of Physics, University of North Carolina, Chapel Hill, North 
Carolina 27514 
CECILE DEWITT 
Department of Physics, University of North Carolina, Chapel Hill, North 
Carolina 27514 


BENO ECKMANN 
Höhere Mathematik, E.T.H., Zürich, Switzerland


LEON EHRENPREIS 
Courant Institute of Mathematical Sciences, New York University, New York, 
N. Y. 10012 
PAUL FEDERBUSH 
Department of Mathematics, Massachusetts Institute of Technology, Cam- 
bridge, Massachusetts 02139 


DIMITRI FOTIADI 
Centre de Physique Théorique, Ecole Polytechnique, Paris 5, France 


MARCEL FROISSART 
Service de Physique Théorique, Saclay, 91 Gif-sur-Yvette, France 
CHARLES J. FREIFELD 
Department of Mathematics, Harvard University, Cambridge, Massachusetts 
02138 




ROBERT P. GEROCH 
Palmer Physical Laboratory, Princeton University, Princeton, New Jersey 08540 
JACK GUNSON 
Department of Mathematical Physics, University of Birmingham, England 
STEPHEN W. HAWKING 
Department of Applied Mathematics and Theoretical Physics, Silver Street, 
Cambridge, England 


SIGURDUR HELGASON 
Department of Mathematics, Massachusetts Institute of Technology, Cam- 
bridge, Massachusetts 02139 


KLAUS HEPP 
Theoretische Physik, E.T.H., Zürich, Switzerland
HEISUKE HIRONAKA 
Department of Mathematics, Columbia University, New York, N. Y. 10027 
JEAN LASCOUX 
Centre de Physique Théorique, Ecole Polytechnique, Paris 5, France 
HENRY LEUTWYLER 
Institut de Physique Théorique, Université de Berne, Sidlerstrasse 5, Berne, 
Switzerland 
ANDRE LICHNEROWICZ 
Physique Mathématique, Collège de France, Paris 5, France
JOHN N. MATHER 
Department of Mathematics, Princeton University, Princeton, New Jersey 08540 
CHARLES W. MISNER 
Department of Physics and Astronomy, University of Maryland, College Park, 
Maryland 20740 


BERNARD C. MORIN 
Department of Mathematics, University of Michigan, Ann Arbor, Michigan 
48104 
ROGER PENROSE 
Department of Physics, Cornell University, Ithaca, New York 14850 


FREDERIC PHAM 
Theory Division, CERN, Geneva 23, Switzerland 


TULLIO REGGE 
International Centre for Theoretical Physics, Trieste, Italy 


MICHAEL I. SHUB 
Department of Mathematics, University of California at Berkeley 94720 


STEPHEN SMALE 

Department of Mathematics, University of California at Berkeley 94720 
NORMAN E. STEENROD 

Department of Mathematics, Princeton University, Princeton, New Jersey 08540 


RENE THOM 
Institut des Hautes Etudes Scientifiques, 91 Bures-sur-Yvette, France 




JOHN A. WHEELER 
Palmer Physical Laboratory, Princeton University, Princeton, New Jersey 08540
JAMES W. YORK, Jr. 
Department of Physics, North Carolina State University, Raleigh, North Caro- 
lina 27607 


Babel and Battelle 


In 1947 John von Neumann wrote, “Mathematics falls into a great
number of subdivisions, differing from one another widely in character, 
style, aims, and influence. It shows the very opposite of the extreme con- 
centration of theoretical physics. A good theoretical physicist may today still 
have a working knowledge of more than half of his subject. I doubt that 
any mathematician now living has much of a relationship to more than a 
quarter.” In the subsequent two decades, mathematics and physics have
grown fantastically; and farther than ever from being Two Cultures, in the 
sense of Snow, they sometimes seem closer to being a hundred cultures, a 
modern version of the Tower of Babel. Those who work on molecular potential
energy surfaces, on deuteron stripping reactions, on gravitational waves,
or on parity nonconservation and the decay of elementary particles often seem 
to be as little aware of each other’s aims and results as those who work on 
point set topology or on mathematical logic or on rings of operators. Why? 
It is not want of interest. It is want of opportunity. Hence the Battelle 
Rencontres in Mathematics and Physics! 

Rare is the man in any area of mathematics or physics who does not 
feel that he is missing important developments, even ideas significant for his 
own life work. Who would not be happy to have better contact with the 
fields in mathematics and physics that are richest in new insights and out- 
looks! Keep up by reading all the literature? Impossible. No one can 
read or even scan the fifteen papers that appear every day in mathematics and 
the one hundred papers in physics. Would it help to wait for the day of 
“electronic information retrieval”? Hardly! To supply with more speed
the same enormous bulk of printed matter, or even the abstracts to it, would 
seem the right answer to the wrong question. The right question? Did 
not Einstein state it in his biography when he spoke of his lifelong endeavor 
to capture the essence of a new idea in a key phrase or in a few sentences so 
that one can convey it to a colleague in all its vividness, free alike of equations 
and of jargon? The heart of our information system is not the memory of a 
computer but the mind of a colleague. 




It is not new for colleagues from two distinct fields to meet together. 
Of a larger meeting one can say what Robert Oppenheimer said of tea, “It is 
where we explain to each other what we don’t understand.” Yet even a
meeting of a week or more has its problems. Enough voices are there to be 
heard to give the whole enterprise too often a fragmented character. Col- 
leagueship will not be hurried. There is no substitute for quietness and slow 
time and walks and talks. 

Quietness and slow time, yes; but urgency too. Who does not know 
from his own experience what he has lost in time or stimulus or insight or 
pay-off from his work through not knowing at the time—as he did later—the 
work of his colleague in the other field? From discussions with colleagues 
both in mathematics and physics we know how widespread this feeling is. 
Sometimes it even reaches the stage of desperation. The words of Clemenceau
about generals are revised to read, “Mathematics is too important to be
left to the mathematicians” or “Physics is too important to be left to the
physicists.” From uttering words as strong as these it is but a small step to
resolve to do something about exploring one’s neighbor discipline. But 
how? 

In the spring of 1966 Dr. Bertram D. Thomas, president of Battelle 
Memorial Institute, and Dr. Frederick J. Milford, director of Battelle 
Memorial Institute research in physics, approached us with the proposal 
that we invite a small number of mathematicians and a small number of 
physicists to meet together for some weeks on a topic or in an area where 
fruitful interactions could be expected. They promised to put at the disposal 
of the group the Battelle-Seattle Research Center. They promised funds 
for the enterprise. They promised to continue it in future summers 
if it worked out successfully. Feeling as we did we had no choice but to 
accept. This volume is the record of the first “Rencontres” as the meeting
came to be called. 

Directing Battelle is one activity; giving physics and mathematics a 
helping hand is another. What is the link? Gordon Battelle, when he died 
in Columbus, Ohio, September 21, 1923, the last of his family line, left his 
estate to found an institute. In writing his will, he sought for society “the 
social and economic benefits to be derived from scientific research and from 
the making of discoveries and inventions.” The Battelle Memorial Institute,
originally concerned, like Gordon Battelle himself, with mining and metal- 
lurgy, and a prime contributor during World War II to the metallurgy of 
uranium and with protective coatings for uranium, has today a far wider 
scope. From xerography its range of interests and contributions extends as
far as drawing up plans for the Department of the Interior for the economic 
development of Alaska; and from a revolutionary technique for making 
reinforced concrete to holography. A scientific and charitable foundation, 
the Battelle Memorial Institute has grown over the years both in size and in 




science. Its first laboratory was in Columbus; subsequent laboratories in 
Frankfurt, Germany, Geneva, Switzerland, and Richland, Washington have 
brought the total staff to nearly 6500. With this increase in staff has gone an 
increase in the number of specialties represented from two or three as at the 
beginning to many score as of today. Dr. Thomas told us of the indebtedness 
that Battelle feels to all sectors of the community of learning as it goes on 
today in its task of applying science to meet the needs of society. He spoke 
in particular of Battelle’s indebtedness to mathematics, the “queen of the
sciences” and to physics, the foundation stone of all the physical sciences.
Colleagues in these two disciplines had spoken to him and to Dr. Milford on 
more than one occasion of their regret that investigators in these two fields 
were not as close companions along the road as they had once been. If 
Battelle, closer than ever to the world of learning, were to enter into still more 
fraternal relations with the men who make up that world, were not then 
mathematics and physics the subjects with which to begin? Hence the 
Battelle initiative. 

Discussions with Dr. Thomas, Dr. Milford, and colleagues active in 
mathematics and physics soon made it clear that for the best results it was 
essential to focus on an area of active current interest to both disciplines. 
These considerations led us, for the experiment of the first summer, to dif- 
ferential geometry and global topology on the mathematical side and to two 
areas of physics with strong ties to this realm of mathematics: Einstein’s 
geometrical theory of gravitation; and the Feynman amplitudes or integrals 
which express the content of quantum mechanical perturbation theory, 
whether that theory is applied to gravitational interactions, or to electro- 
magnetism, or to solid state physics. 

Thirty-three workers in these fields met at the Battelle-Seattle Center, 
adjacent to the University of Washington campus, from 16 July to 31 August 
1967. To provide a framework for discussion, four speakers were invited to 
give a series of lectures on topics of central interest to the participants. 
S. Helgason lectured on Lie groups and symmetric spaces, R. Penrose on 
differential geometry, spinors and space-time singularities, R. Bott on Morse 
theory and its applications to homotopy theory, and J. Lascoux on the topo- 
logical analysis of Feynman amplitudes. 

It was not intended that these lectures should contain new material— 
though here and there they do. The purpose was frankly pedagogical. The 
presentations were so clear that these lectures are included in the present 
volume in order to make them accessible to those working in neighbor 
fields as well as to specialists. In addition to these lectures there were 
informal accounts contributed by the participants at large. These were of 
two types: (1) expository lectures outlining the background, significance, and 
goals of the areas of research in which the speakers themselves are engaged 
—with technical details held to a minimum—and (2) lectures on technical




details, specifically requested by the participants to clarify otherwise obscure 
points. Many of these presentations were subsequently transformed to 
written form. They constitute the remainder of the present volume. In 
some cases the material of a lecture is already so well presented elsewhere that 
it seemed superfluous to include it here. Instead, only the pertinent reference 
is given. 

A glance at the table of contents shows what an astonishing range of 
topics were discussed at the Rencontres: everything from Lie groups to the 
stability of differential systems, from homotopy theory to the quantization of 
the gravitational field, and from differential geometry to S-matrix theory. 
The fact that the majority of participants attended all the discussions shows 
the range of their interests and curiosity. May this book provide hours of 
delectable stimulus to the reader with an active mind and equally catholic 
tastes! 

What can one conclude from the summer’s experiment about the 
desirability and feasibility of widening the lines of communication between 
workers in mathematics and workers in physics? First, there is no universal 
answer valid for everyone. There are as many styles of investigation as there
are people. On the ever-widening front line of research there is room for 
almost every type of personality. There are some brilliant workers in 
mathematical physics who disdain mathematics; and there are some distin- 
guished mathematicians who have no interest in physics. Our surprise was 
only to find out how many workers in the one field want to learn more about 
the other, and conversely—as evidenced by spending seven weeks together. 
Second, mathematicians turned out rarely to be interested in having the 
results of physics translated into mathematical terms. When a new develop- 
ment came under discussion they wanted to get the key idea without the 
details. Third, physicists came to realize more clearly than ever the falsity of 
two old adages: “Just learn mathematics; then all your problems will be 
solved”’; or “Just take your problem to a mathematician; then it will be 
solved!” What a physicist can gain from his mathematical colleagues is 
understanding of the general principles and some feel for the problems of 
active current interest. When he learns something, it will not be the answer 
to a particular problem; it will be a broad new insight. And most of all the 
physicist found that the new insight is converted into new power only by hard 
study and hard work. Fourth, communication demands time. One who 
has studied a language only at school and then finds himself in the appropriate 
country knows how two weeks go by before he begins to understand what is
being said. Only then does the whole new world really begin to pour in upon 
him with all its richness. So it was in these Rencontres! The experiment 
was a success! 

We cannot close without thanking speakers and participants for their 
manifold contributions. This book is the fruit of their labor. We also 




express our appreciation to the advisors who counseled us about the choice of
topics and participants. We are grateful to our colleagues at the University 
of Washington for their warm hospitality. Most of all we are indebted to the 
officers and staff of Battelle who proposed the Rencontres in the first place, 
gave vital support at every step of the way, and through their helpfulness made 
the summer’s work a pleasure. 


Cecile M. DeWitt 
John A. Wheeler 


24 February 1968 


Contents


LIST OF PARTICIPANTS

BABEL AND BATTELLE

I  Lie Groups and Symmetric Spaces
Sigurdur Helgason

II  Special Functions and Representations of Lie Groups*
Leon Ehrenpreis

III  Commutativité de l’algèbre des opérateurs différentiels invariants sur un espace symétrique
André Lichnerowicz

IV  Hyperbolic Partial Differential Equations on a Manifold
Yvonne Choquet-Bruhat

V  Topics on Space-Time
André Lichnerowicz

VI  Relativistic Fluids in Cosmology
Charles W. Misner

VII  Structure of Space-Time
Roger Penrose

VIII  The Structure of Singularities
Robert Geroch

IX  Superspace and the Nature of Quantum Geometrodynamics
John Archibald Wheeler

X  The Topology of Wheeler’s Superspace*
Bryce S. DeWitt

XI  Boundary Conditions for the State Functional in Quantum Theory of Gravity
H. Leutwyler

XII  The Everett-Wheeler Interpretation of Quantum Mechanics
Bryce S. DeWitt

XIII  Progress and Goals in Renormalization Theory
Klaus Hepp

XIV  Perturbation Theory in Quantum Field Theory and Homology
Jean Lascoux

XV  Landau Singularities in the Physical Region
Frederic Pham

XVI  Algebraic Topology Methods in the Theory of Feynman Relativistic Amplitudes
Tullio Regge

XVII  The Use of Padé Approximations in Particle Physics*
Marcel Froissart

XVIII  Topics in Topology and Differential Geometry
Raoul Bott and John Mather

XIX  Continuous Solutions of Linear Equations: Some Exceptional Dimensions in Topology
Beno Eckmann

XX  Differentiable Dynamical Systems*
Stephen Smale

XXI  Characterization of Stable Mappings
John N. Mather

XXII  One-Parameter Subgroups Do Not Fill a Neighborhood of the Identity in an Infinite-Dimensional Lie (Pseudo-) Group
Charles Freifeld

XXIII  A Dynamical Theory for Morphogenesis: Elementary Catastrophes on R^4*
René Thom

XXIV  How to Turn a Sphere Inside Out*
Stephen Smale

XXV  Eversion of the 2-sphere
Bryce S. DeWitt


* No manuscript has been submitted; only the title of the talk is given, with reference
to material already published.


Lie Groups and
Symmetric Spaces


SIGURDUR HELGASON


General Notation
Chapter 1 Introduction
1-1 Lie Groups
1-2 Symmetric Spaces
1-3 Non-Euclidean Fourier Analysis
1-4 Interpretation by Representation Theory
1-5 The Eigenfunctions of the Laplacian on the Non-Euclidean Disk
Chapter 2 Lie Groups and Lie Algebras
2-1 The Lie Algebra of a Lie Group
2-2 The Exponential Mapping
Chapter 3 Structure Theory of Lie Groups
3-1 Solvable and Semisimple Lie Algebras
3-2 Structure of Semisimple Lie Algebras
3-3 Cartan Decompositions
3-4 Discussion of Symmetric Spaces
3-5 The Iwasawa Decomposition
3-6 The Weyl Group
3-7 Boundary and Polar Coordinates on the Symmetric Space G/K
Chapter 4 Functions on Symmetric Spaces
4-1 Invariant Differential Operators
4-2 Harmonic Functions on Symmetric Spaces
4-3 Spherical Functions on Symmetric Spaces
4-4 Fourier Transforms on Symmetric Spaces
4-5 Interpretation by Representation Theory; Eigenfunctions of the Invariant Differential Operators
4-6 Invariant Differential Equations on Symmetric Spaces
4-7 The Wave Equation on Symmetric Spaces
References




The purpose of these lectures is to give an account of the theory of those 
Lie groups which have played a particular role in geometry and in physics— 
the so-called semisimple Lie groups. Associated with these groups are the 
symmetric spaces, whose theory is a kind of an intersection of Riemannian 
geometry and Lie group theory. 

The primary prerequisites for reading these notes are some familiarity 
with the elements of the theory of topological groups and differentiable mani- 
folds. The emphasis is on noncompact semisimple Lie groups and the asso- 
ciated (noncompact) symmetric spaces. The function theory on these spaces 
is treated in a relatively detailed manner; however the holomorphic function 
theory is omitted altogether. 

Although the definitions and theorems are usually stated in full gener- 
ality, complete proofs are given only if they are either very short or particu- 
larly instructive. Verification for a special case is a frequent substitute for a 
proof. A study of special cases is in fact very important for understanding of 
Lie theory. With this in mind, Chapter 1 is devoted to the special group 
G = SU(1, 1) and the associated symmetric space, the non-Euclidean disk. 
Chapters 2 and 3 deal with selected topics from the classical theory of Lie 
groups and symmetric spaces. The results in Chapter 4 are of more recent 
vintage but almost all of them have been published elsewhere. The only 
exceptions are the integral representation of the eigenfunctions of the Lapla- 
cian on the non-Euclidean disk (Theorem 5.1, Ch. 1) and the extension of 
Fatou’s theorem to harmonic functions on symmetric spaces (Theorem 2.12, 
Ch. 4) proved by A. Korányi and the author.

I am indebted to members of the Summer Rencontre for helpful discus- 
sions during the writing of these notes, particularly B. Carter, Y. Choquet- 
Bruhat, and L. Ehrenpreis. 


GENERAL NOTATION 


We list here some standard notation which will be utilized throughout
the lectures. The symbols $\mathbf{R}$, $\mathbf{C}$, and $\mathbf{Z}$ refer to the real numbers, the complex
numbers, and the integers, respectively. The nonnegative reals are denoted
by $\mathbf{R}^+$ and the nonnegative integers by $\mathbf{Z}^+$. The conjugate of a complex
number $c$ is denoted by $\bar{c}$. The empty set is denoted by $\emptyset$. If $X$ is a set and
$x \in X$ then the subset of $X$ consisting of $x$ alone is denoted by $\{x\}$.

If $M$ is a manifold, the set of complex-valued indefinitely differentiable
functions on $M$ is denoted $C^\infty(M)$. The set of functions $f \in C^\infty(M)$ of compact
support is denoted $C_c^\infty(M)$. If $p \in M$ the tangent space to $M$ at $p$ is
denoted by $M_p$. Let $M$ and $N$ be manifolds and $\phi : M \to N$ a differentiable
mapping. The differential of $\phi$ at a point $p \in M$, denoted $d\phi_p$, or just $d\phi$, is
a mapping of $M_p$ into $N_{\phi(p)}$ defined by $d\phi_p(X)(f) = X(f \circ \phi)$ if $X$ is any vector
in $M_p$ and $f$ any function in $C^\infty(N)$. If $t \to \gamma(t)$ is any curve in $M$ with tangent
vector $X$ at the point $p$ then $d\phi_p(X)$ is the tangent vector to the curve
$t \to \phi(\gamma(t))$ at $\phi(p)$. The differentiable map $\phi : M \to N$ is called a diffeomorphism
if it is a one-to-one map of $M$ onto $N$ and if the inverse map
$\phi^{-1} : N \to M$ is differentiable.


CHAPTER 1: INTRODUCTION 


1-1 Lie Groups 


A Lie group is a group $G$ which is also an analytic manifold such that
the mapping $(g, h) \to gh^{-1}$ of the product manifold $G \times G \to G$ is analytic.

Roughly speaking, this means that, at least locally, a Lie group is 
parametrized by an n-tuple of real numbers such that the group operations 
are expressed by analytic functions in these parameters. This makes it 
possible to study these groups by analytical methods. 

Lie group theory can be traced back to Sophus Lie’s applications of 
group theory to geometric situations as well as to his desire to obtain a 
theory of differential equations which paralleled Galois’ theory for algebraic 
equations. Since groups at that time were usually viewed as permutation 
groups, the geometric problems led naturally to the consideration of trans- 
formation groups with certain invariance properties. These invariance 
properties often give rise to a parametrization of the group, turning it into a 
Lie group. 


Example 


Let $G$ denote the group of transformations of the plane $\mathbf{R}^2$ preserving
distance as well as orientation. If $g \in G$ let $(x(g), y(g))$ denote the coordinates
of $g \cdot 0$ ($0$ is the origin in $\mathbf{R}^2$) and $\theta(g)$ the angle from the x axis $l$ to the line
$g \cdot l$. The parametrization

$$g \to (x(g), y(g), \theta(g))$$


turns G into a Lie group. 
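
The parametrization in this example can be tried out numerically. The following is a minimal sketch, not part of the lectures: the function names `apply`, `compose`, and `inverse` are hypothetical, and a motion is assumed to act as a rotation by $\theta$ followed by a translation by $(x, y)$. In these parameters the group operations are given by sines, cosines, sums, and products, that is, by analytic functions.

```python
import math

def apply(g, v):
    """Apply the rigid motion g = (x, y, theta) to the point v = (vx, vy)."""
    x, y, theta = g
    c, s = math.cos(theta), math.sin(theta)
    return (x + c * v[0] - s * v[1], y + s * v[0] + c * v[1])

def compose(g, h):
    """Parameters of the motion 'first h, then g'."""
    # The translation part is the image of the origin under g after h.
    x, y = apply(g, (h[0], h[1]))
    # Rotation angles add (they are only defined modulo 2*pi).
    return (x, y, g[2] + h[2])

def inverse(g):
    """Parameters of the inverse motion v -> R(-theta)(v - (x, y))."""
    x, y, theta = g
    c, s = math.cos(theta), math.sin(theta)
    return (-(c * x + s * y), s * x - c * y, -theta)
```

Since $\theta(g)$ is an angle, the third parameter is a coordinate only locally, in line with the remark that a Lie group is parametrized by real numbers at least locally.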


1-2 Symmetric Spaces 


Let $M$ be a $C^\infty$ manifold. A Riemannian structure on $M$ is a positive
definite inner product $\langle\ ,\ \rangle$ on the tangent space $M_p$ at an arbitrary point
$p \in M$. It is assumed that if $X$, $Y$ are $C^\infty$ vector fields on $M$ then the function
$p \to \langle X_p, Y_p \rangle$ is a $C^\infty$ function on $M$. A manifold with a Riemannian
structure is called a Riemannian manifold.




Example 


The following example is of basic importance and will accompany us 
throughout these lectures. 

Let $D$ be the open unit disk $|z| < 1$ in $\mathbf{R}^2$ with the usual manifold structure
but given the following Riemannian structure: If $u$, $v$ are tangent vectors
at the point $z \in D$, put

$$\langle u, v \rangle_z = \frac{(u, v)}{(1 - |z|^2)^2} \tag{1}$$

$(\ ,\ )$ denoting the usual inner product on $\mathbf{R}^2$. Since

$$\frac{\langle u, v \rangle^2}{\langle u, u \rangle \langle v, v \rangle} = \frac{(u, v)^2}{(u, u)(v, v)}$$

the angle between $u$ and $v$ in the new Riemannian structure coincides with the
Euclidean angle.

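The length of a curve $\gamma(t)$ $(\alpha \le t \le \beta)$ on a Riemannian manifold is
defined by

$$L(\gamma) = \int_\alpha^\beta \langle \gamma'(t), \gamma'(t) \rangle^{1/2} \, dt$$

and the distance between two points $p, q \in M$ is defined by

$$d(p, q) = \inf_\gamma L(\gamma)$$

the infimum taken over all curves joining $p$ and $q$. In our case if $\gamma(t) =
(x(t), y(t))$ and $s(\tau)$ is the arc-length of the segment $\gamma(t)$ $(0 \le t \le \tau)$, we get

$$\left(\frac{ds}{d\tau}\right)^2 = \left[1 - x(\tau)^2 - y(\tau)^2\right]^{-2} \left[\left(\frac{dx}{d\tau}\right)^2 + \left(\frac{dy}{d\tau}\right)^2\right]$$

In classical terminology this is written

$$ds^2 = \frac{dx^2 + dy^2}{[1 - (x^2 + y^2)]^2} \tag{2}$$

In particular, if $\gamma(\alpha) = 0$, $\gamma(\beta) = x$ (point on the x axis) and we denote by $\gamma_0$
the line segment from $0$ to $x$, we get from

$$\frac{x'(t)^2}{[1 - x(t)^2]^2} \le \frac{x'(t)^2 + y'(t)^2}{\{1 - [x(t)^2 + y(t)^2]\}^2}$$

the inequality

$$L(\gamma_0) \le L(\gamma)$$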

Thus

$$d(0, z) = \frac{1}{2} \log \frac{1 + |z|}{1 - |z|} \tag{3}$$

and the straight lines through the origin are geodesics.
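
As a numerical illustration of (2) and (3), the sketch below approximates the length of a curve in $D$ by summing short chords weighted by $1/(1 - |z|^2)$. It is only a rough check under stated assumptions: the names `length` and `d0`, the midpoint rule, and the particular detour curve are choices made here, not taken from the lectures.

```python
import math

def length(curve, a, b, n=20000):
    """Approximate non-Euclidean length of t -> curve(t) (complex-valued), a <= t <= b."""
    total = 0.0
    h = (b - a) / n
    for k in range(n):
        z0 = curve(a + k * h)
        z1 = curve(a + (k + 1) * h)
        zm = 0.5 * (z0 + z1)
        # ds = |dz| / (1 - |z|^2), from formula (2)
        total += abs(z1 - z0) / (1.0 - abs(zm) ** 2)
    return total

def d0(z):
    """Closed form (3) for the distance from the origin."""
    return 0.5 * math.log((1.0 + abs(z)) / (1.0 - abs(z)))

x = 0.8
segment = lambda t: complex(t * x, 0.0)                           # straight segment from 0 to x
detour = lambda t: complex(t * x, 0.3 * math.sin(math.pi * t))    # another curve from 0 to x
```

With these definitions `length(segment, 0.0, 1.0)` should agree with `d0(x)` to many digits, while `length(detour, 0.0, 1.0)` should come out strictly larger, as the inequality $L(\gamma_0) \le L(\gamma)$ requires.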
Let us now determine the group $I(D)$ of all isometries of $D$. If $a, b \in \mathbf{C}$
then the transformation

$$g : z \to \frac{az + b}{\bar{b}z + \bar{a}}, \qquad |a|^2 - |b|^2 = 1 \tag{4}$$

maps $D$ onto itself. Let us verify that $g$ preserves the Riemannian structure
(1): Let $z(t)$ be a curve with $z(0) = z$, $z'(0) = u$. Then

$$g \cdot u = \left\{\frac{d}{dt}\, g(z(t))\right\}_{t=0} = \text{the vector } \frac{z'(0)}{(\bar{b}z + \bar{a})^2} \text{ at } g \cdot z$$

and the relation

$$\langle g \cdot u, g \cdot u \rangle = \langle u, u \rangle$$

follows immediately. Now if $h \in I(D)$ is arbitrary, there exists a $g$ as in (4)
such that $gh^{-1}$ leaves the x axis pointwise fixed. But then $gh^{-1}$ is either the
identity or the conjugation $z \to \bar{z}$. Thus $I(D)$ is generated by the transformations
(4) and the conjugation $c : z \to \bar{z}$. Denoting as usual

$$SU(1, 1) = \left\{ \begin{pmatrix} a & b \\ \bar{b} & \bar{a} \end{pmatrix} : |a|^2 - |b|^2 = 1 \right\}$$

and by $I$ the identity matrix, we have

$$I(D) = (SU(1, 1)/\pm I) \cup c\,(SU(1, 1)/\pm I)$$

In particular, $I(D)$ is a Lie group (a fact which was proved for all Riemannian
manifolds in Myers and Steenrod [55]).
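
The verification that the transformations (4) preserve the Riemannian structure (1) can be repeated numerically. The sketch below is illustrative only; the helper names `su11`, `act`, `dact`, and `norm2` are hypothetical, tangent vectors are represented as complex numbers, and the pair $(a, b)$ is produced from three real parameters chosen here for convenience.

```python
import cmath
import math

def su11(t, alpha, beta):
    """A pair (a, b) with |a|^2 - |b|^2 = 1 (cosh^2 t - sinh^2 t = 1)."""
    a = math.cosh(t) * cmath.exp(1j * alpha)
    b = math.sinh(t) * cmath.exp(1j * beta)
    return a, b

def act(a, b, z):
    """The transformation (4): z -> (a z + b) / (conj(b) z + conj(a))."""
    return (a * z + b) / (b.conjugate() * z + a.conjugate())

def dact(a, b, z, u):
    """Image of the tangent vector u at z: u / (conj(b) z + conj(a))^2."""
    return u / (b.conjugate() * z + a.conjugate()) ** 2

def norm2(z, u):
    """<u, u> at the point z in the Riemannian structure (1)."""
    return abs(u) ** 2 / (1.0 - abs(z) ** 2) ** 2
```

For any point `z` in the disk and tangent vector `u`, the values `norm2(z, u)` and `norm2(act(a, b, z), dact(a, b, z, u))` should agree up to rounding error, and `act(a, b, z)` should again lie in the disk.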

Since the group of transformations (4) is transitive on $D$ we deduce that
the geodesics in $D$ are the circular arcs perpendicular to the boundary $|z| = 1$.
Since the expression for $d(0, z)$ can be written by means of the cross ratio

$$d(0, z) = \frac{1}{2} \log \left( \frac{0 - z/|z|}{0 + z/|z|} : \frac{z - z/|z|}{z + z/|z|} \right)$$

and since the cross ratio is invariant under fractional linear transformations
we obtain

$$d(z_1, z_2) = \frac{1}{2} \log \left( \frac{z_1 - b_2}{z_1 - b_1} : \frac{z_2 - b_2}{z_2 - b_1} \right) \tag{5}$$

($b_1$, $b_2$ being shown in Fig. 1: the endpoints on $|z| = 1$ of the geodesic through
$z_1$ and $z_2$, with $b_1$ on the side of $z_1$ and $b_2$ on the side of $z_2$). But the space $D$
with this distance $d$ is of course the classical Poincaré model of non-Euclidean geometry.


Definition. A Riemannian manifold $M$ is called symmetric (or globally
symmetric) in the sense of E. Cartan if for each $p \in M$ there is an isometry
$s_p$ of $M$ onto itself which reverses the geodesics through $p$ ($s_p$ is called the
geodesic symmetry with respect to $p$).


FIGURE 1


Since the symmetry $s_0 : z \to -z$ is of the form (4) it is an isometry of $D$.
If $g \in I(D)$, then the isometry $g s_0 g^{-1}$ reverses the geodesics through $g \cdot 0$;
$I(D)$ being transitive, $D$ is therefore symmetric.

Let $\gamma(t)$ $(-\infty < t < \infty)$ be a geodesic in a symmetric space $M$, let
$s_t = s_{\gamma(t)}$, and let $\tau_t$ denote the Levi-Civita parallel transport along $\gamma$ from $0$
to $t$. If $L$ is a tangent vector to $M$ at $\gamma(t)$, then since $s_0$ preserves parallelism
and $s_0(\tau_{-t} L) = -\tau_{-t} L$ we see that $s_0(L) = -\tau_{-2t} L$. Consequently, the
isometry $T_t = s_{t/2}\, s_0$ realizes the parallelism from $0$ to $t$ along $\gamma$. The isometries
$T_t$ actually form a one-parameter group, the group of transvections
along the geodesic $\gamma$.

Let $M$ be a Riemannian manifold, $(U, \phi)$ a local coordinate system and
$\phi(q) = (x_1, \ldots, x_n)$ for $q \in U$. We put

$$g_{ij} = \left\langle \frac{\partial}{\partial x_i}, \frac{\partial}{\partial x_j} \right\rangle, \qquad g = \det(g_{ij}), \qquad (g^{ij}) = (g_{ij})^{-1}$$

Then we can define a measure $\mu$ on $U$ by

$$\mu(C) = \int_{\phi(C)} \sqrt{g} \; dx_1 \cdots dx_n \tag{6}$$

(where we have written $\sqrt{g}$ for $\sqrt{g} \circ \phi^{-1}$). This definition is invariant under
coordinate changes and defines a measure on $M$, the Riemannian measure.
Somewhat imprecisely ($M$ is not necessarily orientable) one refers to
$\sqrt{g} \; dx_1 \cdots dx_n$ as the volume element on $M$.

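We also recall the Laplace-Beltrami operator defined for $f \in C^\infty(U)$ by

$$\Delta : f \to \frac{1}{\sqrt{g}} \sum_k \frac{\partial}{\partial x_k} \left( \sum_i g^{ik} \sqrt{g} \, \frac{\partial f}{\partial x_i} \right) \tag{7}$$

Again, the expression on the right can be shown to be invariant under coordinate
changes and so defines a differential operator on $M$.
In the case of $D$ we find at once from (1),

$$g_{ij} = [1 - |z|^2]^{-2} \delta_{ij} \quad (\delta_{ij} = \text{Kronecker delta}), \qquad g^{ij} = [1 - |z|^2]^{2} \delta_{ij}, \qquad g = (1 - |z|^2)^{-4}$$

The volume element is therefore given by

$$[1 - (x^2 + y^2)]^{-2} \, dx \, dy \tag{8}$$

and the Laplace-Beltrami operator is

$$\Delta = [1 - (x^2 + y^2)]^{2} \left( \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} \right) \tag{9}$$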

1-3 Non-Euclidean Fourier Analysis 


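We shall now define a Fourier transform on the non-Euclidean disk $D$.
First we recall the Fourier inversion formula on $\mathbf{R}^n$. For $f \in L^1(\mathbf{R}^n)$ put

$$\tilde{f}(u) = \int_{\mathbf{R}^n} f(x) e^{-i(x, u)} \, dx \tag{1}$$

$(\ ,\ )$ denoting the usual inner product on $\mathbf{R}^n$. Then if $f \in C_c^\infty(\mathbf{R}^n)$,

$$f(x) = (2\pi)^{-n} \int_{\mathbf{R}^n} \tilde{f}(u) e^{i(x, u)} \, du \tag{2}$$

Let us introduce polar coordinates $u = \lambda\omega$, $\lambda \ge 0$, and $\omega$ is a unit vector.
Then (1) and (2) become

$$\tilde{f}(\lambda\omega) = \int_{\mathbf{R}^n} f(x) e^{-i\lambda(x, \omega)} \, dx \tag{3}$$

$$f(x) = (2\pi)^{-n} \int_{\mathbf{R}^+} \int_{S^{n-1}} \tilde{f}(\lambda\omega) e^{i\lambda(x, \omega)} \lambda^{n-1} \, d\lambda \, d\omega \tag{4}$$

where $\mathbf{R}^+ = \{\lambda \in \mathbf{R} \mid \lambda \ge 0\}$ and $d\omega$ is the volume element on the unit sphere
$S^{n-1}$.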

Because the functions $e_u : x \to e^{i(x, u)}$ are characters of the group $\mathbf{R}^n$, the
Fourier transform (1) can be generalized to locally compact Abelian groups.
Since $D$ is not a group this viewpoint is not directly applicable here. However,
the functions $e_u$ have the following properties:


(i) $e_u$ is an eigenfunction of the Laplace operator on $\mathbf{R}^n$;
(ii) $e_u$ is constant on each hyperplane perpendicular to $u$ (“plane wave”
with normal $u$).


These properties essentially characterize the exponentials and since they are 
geometric properties we shall see that they have analogs for the space D. 


FIGURE 2 


Parallel geodesics in $D$ are by definition geodesics corresponding to
the same point $b$ on the boundary $B$ of $D$. A horocycle with normal $b$ is by
definition an orthogonal trajectory to the family of all parallel geodesics
corresponding to $b$. Thus a horocycle in $D$ is the non-Euclidean analog of a
hyperplane in $\mathbf{R}^n$. Since the inner product $(x, \omega)$ in (3) is the distance from
the origin to the hyperplane with normal $\omega$ passing through $x$ we define
$\langle z, b \rangle$ for $z \in D$, $b \in B$, as the non-Euclidean distance from $0$ to the horocycle
$\xi(z, b)$ with normal $b$, passing through $z$. (Here $\langle z, b \rangle$ is taken negative in
case $0$ falls inside the horocycle.)

For $\mu \in \mathbf{C}$, $b \in B$ we consider the function

$$e_{\mu, b} : z \to e^{\mu \langle z, b \rangle}, \qquad z \in D \tag{5}$$




These formal analogs of the exponential functions on $\mathbf{R}^n$ are also conceptual
analogs for they satisfy the following non-Euclidean counterparts to (i) and
(ii):

(i)′ $e_{\mu, b}$ is an eigenfunction of the Laplace-Beltrami operator on $D$ (for
example, use (9) in §1-2 and (11) below);

(ii)′ $e_{\mu, b}$ is constant on each horocycle with normal $b$.


Consequently, we define Fourier analysis on $D$ to be decomposition of
“arbitrary” functions into functions $e_{\mu, b}$ in (5).


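Theorem 3.1. For $f \in C_c^\infty(D)$ set
\[ \tilde f(\lambda, b) = \int_D f(z)\, e^{(-i\lambda+1)\langle z, b\rangle}\, dz, \qquad \lambda \in \mathbf{R},\ b \in B \]
where dz is the volume element on D. Then
\[ f(z) = (2\pi)^{-2} \int_{\mathbf{R}} \int_B \tilde f(\lambda, b)\, e^{(i\lambda+1)\langle z, b\rangle}\, \lambda \tanh(\tfrac12\pi\lambda)\, d\lambda\, db \tag{6} \]
where db is the usual angular measure on B.

We shall now indicate how (6) follows from classical facts. Denote the
measure $(2\pi)^{-2}\lambda \tanh(\tfrac12\pi\lambda)\, d\lambda\, db$ by $d\mu(\lambda, b)$ and define the operators T and
S by
\[ (Tf)(\lambda, b) = \tilde f(\lambda, b), \qquad f \in C_c^\infty(D) \]
\[ (SF)(z) = \int_{\mathbf{R}\times B} F(\lambda, b)\, e^{(i\lambda+1)\langle z, b\rangle}\, d\mu(\lambda, b) \]
the function F restricted such that the integral converges absolutely. Then
\[ \int_D f(z)\, \overline{(SF)(z)}\, dz = \int_{\mathbf{R}\times B} (Tf)(\lambda, b)\, \overline{F(\lambda, b)}\, d\mu(\lambda, b) \]
and by iteration
\[ \int_D f(z)\, \overline{(STg)(z)}\, dz = \int_D (STf)(z)\, \overline{g(z)}\, dz, \qquad f, g \in C_c^\infty(D) \tag{7} \]
because Tf and Tg satisfy the growth restrictions placed on F.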
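Lemma 3.2. Let $\tau$ be an isometry of D and if g is a function on D, put $g^\tau(z) =
g(\tau^{-1}\cdot z)$. Then
\[ STf^\tau = (STf)^\tau \qquad \text{for } f \in C_c^\infty(D) \]


PROOF. Since $\tau$ preserves the volume element on D,

\[ \widetilde{f^\tau}(\lambda, b) = \int_D f(z)\, e^{(-i\lambda+1)\langle \tau\cdot z,\, b\rangle}\, dz \tag{8} \]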


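But the isometry $\tau$ extends in an obvious way to the boundary B (cf. (4) §1-2),
and we have
\[ \langle \tau\cdot z, \tau\cdot b\rangle = \langle z, b\rangle + \langle \tau\cdot 0, \tau\cdot b\rangle \tag{9} \]

This identity is easily seen by observing that the horocycles $\xi(\tau\cdot 0, \tau\cdot b)$ and
$\xi(\tau\cdot z, \tau\cdot b)$ cut segments of equal length off the parallel geodesics $(0, \tau\cdot b)$
and $(\tau\cdot 0, \tau\cdot b)$. Thus
\[ \langle \tau\cdot z, b\rangle = \langle z, \tau^{-1}\cdot b\rangle + \langle \tau\cdot 0, b\rangle \]
so (8) becomes
\[ \widetilde{f^\tau}(\lambda, b) = e^{(-i\lambda+1)\langle \tau\cdot 0,\, b\rangle}\, \tilde f(\lambda, \tau^{-1}\cdot b) \]
so
\[ (STf^\tau)(z) = \int_{\mathbf{R}\times B} \tilde f(\lambda, \tau^{-1}\cdot b)\, e^{(-i\lambda+1)\langle \tau\cdot 0,\, b\rangle}\, e^{(i\lambda+1)\langle z,\, b\rangle}\, d\mu(\lambda, b) \]

Now we change variables; the Jacobian of the mapping $b \to \tau\cdot b$ satisfies
\[ \frac{d(\tau\cdot b)}{db} = e^{2\langle \tau^{-1}\cdot 0,\, b\rangle}, \qquad b \in B \tag{10} \]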


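In order to verify this observe that $\tau = k_1 \sigma k_2$, where $k_1$, $k_2$ are rotations around
0, and $\sigma$ maps the x axis onto itself. We can thus assume $\tau$ of the form
\[ \tau\cdot z = \frac{(\cosh t)z + \sinh t}{(\sinh t)z + \cosh t} \]
so if $b = e^{i\phi}$, the left-hand side of (10) equals $(\cosh 2t + \sinh 2t \cos\phi)^{-1}$.
On the other hand, a simple computation using (3) §1-2 shows that if $z =
|z|e^{i\theta}$, $b = e^{i\phi}$ then
\[ e^{2\langle z, b\rangle} = \frac{1 - |z|^2}{1 - 2|z|\cos(\theta - \phi) + |z|^2} \tag{11} \]
so, in particular, (10) follows. Using also $\langle \tau^{-1}\cdot 0, b\rangle = -\langle \tau\cdot 0, \tau\cdot b\rangle$
[which follows from (9)], we obtain
\[ (STf^\tau)(z) = \int_{\mathbf{R}\times B} \tilde f(\lambda, b)\, e^{(-i\lambda-1)\langle \tau\cdot 0,\, \tau\cdot b\rangle}\, e^{(i\lambda+1)\langle z,\, \tau\cdot b\rangle}\, d\mu(\lambda, b) \]
which again by (9) equals $(STf)(\tau^{-1}\cdot z)$, proving the lemma.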

In order to prove Theorem 3.1, that f = STf, it suffices, by (7), to prove
this for a sequence $(f_n)$ where $f_n \to \delta_z$, the delta function at an arbitrary point
$z \in D$. By Lemma 3.2 we can assume that z is the origin in D. But then the
functions $f_n$ could be taken to be radial functions. But if $f(z) = F(d(0, z))$,
$F \in C_c^\infty(\mathbf{R})$ (F even), then $\tilde f(\lambda, b)$ is an even function $\tilde F(\lambda)$ of $\lambda$ alone. If
r = d(0, z) then
\[ z = |z| e^{i\theta} = (\tanh r)e^{i\theta} \]

In the coordinates $(r, \theta)$ the volume element (8) §1-2 becomes
\[ dz = \tfrac12 \sinh 2r\, dr\, d\theta \]

If we now consider the Legendre function
\[ P_\nu(\cosh r) = \frac{1}{2\pi} \int_0^{2\pi} (\cosh r + \sinh r \cos\theta)^\nu\, d\theta \qquad (\nu \in \mathbf{C}) \]
the formulas in Theorem 3.1 become

\[ \tilde F(\lambda) = \pi \int_0^\infty F(r)\, P_{\frac12(i\lambda-1)}(\cosh 2r)\, \sinh(2r)\, dr \tag{12} \]

\[ F(r) = \frac{1}{2\pi} \int_{\mathbf{R}} \tilde F(\lambda)\, P_{\frac12(i\lambda-1)}(\cosh 2r)\, \lambda \tanh(\tfrac12\pi\lambda)\, d\lambda \tag{13} \]


After a harmless change of variables, (13) becomes simply the inversion form- 
ula for the Mehler transform (Erdélyi [17], Vol. I, p. 175, Fok [18] and Gode- 
ment [22a]). Assuming this inversion formula, Theorem 3.1 is proved (cf. 
Helgason [35], [36]). 

If we compare the formulas in Theorem 3.1 with (3) and (4) we note
a factor $e^{2\langle z, b\rangle}$ which has no analog in the Euclidean case. But according to
(11) this factor is just the classical Poisson kernel but expressed in non-Euclidean
terms. Consequently, the classical Poisson integral formula for a
harmonic function u on D with continuous boundary values f(b) on B,
\[ u(re^{i\theta}) = \frac{1}{2\pi} \int_0^{2\pi} \frac{1 - r^2}{1 - 2r\cos(\theta - \phi) + r^2}\, f(e^{i\phi})\, d\phi \]
can be written
\[ u(z) = \frac{1}{2\pi} \int_0^{2\pi} e^{2\langle z, b\rangle} f(b)\, db \tag{14} \]


According to our stated conventions this is a formula in Fourier analysis 
on D. 
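
The Poisson formula above can be checked numerically. The following is a minimal sketch, not part of the original lecture: it assumes the boundary function $f(e^{i\phi}) = \cos k\phi$, whose harmonic extension is $\operatorname{Re} z^k = r^k \cos k\theta$, and approximates the integral by an equally spaced sum. The names `poisson_kernel` and `poisson_integral` are illustrative only.

```python
import math

def poisson_kernel(r, theta, phi):
    """Classical Poisson kernel for z = r e^{i theta}, b = e^{i phi}; by (11) this is exp(2<z, b>)."""
    return (1 - r * r) / (1 - 2 * r * math.cos(theta - phi) + r * r)

def poisson_integral(f, r, theta, n=2000):
    """Approximate (1/2pi) * integral over [0, 2pi] of P(z, b) f(phi) dphi by an equally spaced sum."""
    h = 2 * math.pi / n
    total = sum(poisson_kernel(r, theta, j * h) * f(j * h) for j in range(n))
    return total * h / (2 * math.pi)

k = 3                                   # assumed boundary data: f(e^{i phi}) = cos(k phi)
r, theta = 0.6, 0.9                     # an arbitrary interior point z = r e^{i theta}
u_numeric = poisson_integral(lambda phi: math.cos(k * phi), r, theta)
u_exact = r ** k * math.cos(k * theta)  # harmonic extension Re(z^k)
print(u_numeric, u_exact)
```

Taking f identically 1 in the same routine returns a value close to 1, which reflects the fact that the kernel has total mass $2\pi$ on B.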

Note that the Euclidean harmonic functions coincide with the non- 
Euclidean harmonic functions according to (9) §1-2. Thus (14) is entirely 
non-Euclidean. 


1-4 Interpretation by Representation Theory 


Let X be a space with a measure $\mu$ and let G be a transformation group
of X leaving the measure $\mu$ invariant. To each $g \in G$ we associate the operator
$T(g): f \to f^g$ on the space $L^2(X)$ of square-integrable functions on X.
(As in Lemma 3.2, $f^g$ denotes the function $x \to f(g^{-1}\cdot x)$ on X.) Then the
mapping $g \to T(g)$ is a unitary representation of G on the Hilbert space $L^2(X)$.
Now arises the natural problem of decomposing this unitary representation T
into irreducible representations $T_a$ acting on Hilbert spaces $\mathfrak{H}_a$ such that for
a suitable measure $\nu$

\[ L^2(X) = \int \mathfrak{H}_a\, d\nu(a), \qquad T = \int T_a\, d\nu(a) \tag{1} \]

in the sense of direct integrals of Hilbert spaces (see, for example, Dixmier
[15]). In §1-3 we have some examples of (1):

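a. First let G denote the group of translations of $\mathbf{R}^n$. Then for each
$u \in \mathbf{R}^n$ the space $\mathfrak{H}_u = \mathbf{C}e_u$ is invariant and irreducible under G; let $T_u$ denote
the representation of G on $\mathfrak{H}_u$ given by
\[ [T_u(g)f](x) = f(g^{-1}x) \qquad \text{for } f \in \mathfrak{H}_u,\ g \in G,\ x \in \mathbf{R}^n. \]
Then (2) in §1-3, together with the Plancherel formula
\[ \int_{\mathbf{R}^n} |f(x)|^2\, dx = (2\pi)^{-n} \int_{\mathbf{R}^n} |\tilde f(u)|^2\, du, \]
can be written
\[ L^2(\mathbf{R}^n) = \int_{\mathbf{R}^n} \mathfrak{H}_u\, du^*, \qquad T = \int_{\mathbf{R}^n} T_u\, du^* \tag{2} \]

where $du^* = (2\pi)^{-n}\, du$.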

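b. Next let G denote the group of all transformations of $\mathbf{R}^n$ preserving
orientation and distance. For each $\lambda \in \mathbf{R}^+$ consider the Hilbert space of functions
on $\mathbf{R}^n$ given by

\[ \mathfrak{H}_\lambda = \Big\{ F_\lambda(x) = \int_{S^{n-1}} e^{i\lambda(x,\omega)} F(\omega)\, d\omega \ \Big|\ F \in L^2(S^{n-1}) \Big\} \tag{3} \]

(defining $\|F_\lambda\|$ as the $L^2$ norm of F) and let $T_\lambda$ denote the representation of G
on $\mathfrak{H}_\lambda$ given by

\[ (T_\lambda(g)F_\lambda)(x) = F_\lambda(g^{-1}x), \qquad F_\lambda \in \mathfrak{H}_\lambda,\ g \in G,\ x \in \mathbf{R}^n \]

$T_\lambda$ is in fact a unitary representation, because if g = tk (t is the translation,
k the rotation around 0), then

\[ (T_\lambda(g)F_\lambda)(x) = \int_{S^{n-1}} e^{-i\lambda(t,\omega)}\, e^{i\lambda(x,\omega)}\, F(k^{-1}\omega)\, d\omega \tag{4} \]

and $T_\lambda$ is in fact irreducible (cf. Itô [42] and Mackey [51, §14]) and different
$\lambda$ in $\mathbf{R}^+$ give inequivalent $T_\lambda$. Thus (4) in §1-3 together with the Plancherel
formula

\[ \int_{\mathbf{R}^n} |f(x)|^2\, dx = (2\pi)^{-n} \int_{\mathbf{R}^+ \times S^{n-1}} |\tilde f(\lambda\omega)|^2\, \lambda^{n-1}\, d\lambda\, d\omega \]

gives the direct integral decomposition
\[ L^2(\mathbf{R}^n) = \int_{\mathbf{R}^+} \mathfrak{H}_\lambda\, d\lambda^*, \qquad T = \int_{\mathbf{R}^+} T_\lambda\, d\lambda^* \tag{5} \]

where $d\lambda^* = (2\pi)^{-n}\lambda^{n-1}\, d\lambda$.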
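c. Finally, we consider the case when G is the group SU(1, 1) operating
on D. For each $\lambda \in \mathbf{R}$ consider the Hilbert space

\[ \mathfrak{H}_\lambda = \Big\{ h_\lambda(z) = \int_B e^{(i\lambda+1)\langle z, b\rangle} h(b)\, db \ \Big|\ h \in L^2(B) \Big\} \]

(defining $\|h_\lambda\|$ as the $L^2$ norm of h) and let $T_\lambda$ denote the representation of G
on $\mathfrak{H}_\lambda$ given by

\[ [T_\lambda(g)h_\lambda](z) = h_\lambda(g^{-1}\cdot z) \]
Using formulas (9) and (10) in §1-3, we find

\[ h_\lambda(g^{-1}\cdot z) = \int_B e^{(i\lambda+1)\langle z, b\rangle}\, e^{(-i\lambda+1)\langle g\cdot 0,\, b\rangle}\, h(g^{-1}\cdot b)\, db \]

[compare with (4)]; so using (10) again we see that $T_\lambda$ is unitary; comparing
with Bargmann [1], Thm. 1, p. 613, we see that $T_\lambda$ is irreducible. Finally (6)
in §1-3 and the Plancherel formula

\[ \int_D |f(z)|^2\, dz = \int_{\mathbf{R}\times B} |\tilde f(\lambda, b)|^2\, d\mu(\lambda, b) \]

show that
\[ L^2(D) = \int_{\mathbf{R}/\mathbf{Z}_2} \mathfrak{H}_\lambda\, d\mu(\lambda), \qquad T = \int_{\mathbf{R}/\mathbf{Z}_2} T_\lambda\, d\mu(\lambda) \tag{6} \]

where $d\mu(\lambda) = 2(2\pi)^{-2}\lambda \tanh(\tfrac12\pi\lambda)\, d\lambda$ and integration is taken over $\mathbf{R}/\mathbf{Z}_2$ since
$T_\lambda$ and $T_\mu$ can be shown equivalent if and only if $\lambda = \pm\mu$.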


1-5 The Eigenfunctions of the Laplacian on the Non- 
Euclidean Disk 


Let P(z, b) denote the Poisson kernel
\[ P(z, b) = \frac{1 - |z|^2}{1 - 2|z|\cos(\theta - \phi) + |z|^2}, \qquad z = |z|e^{i\theta},\ b = e^{i\phi} \]

If $\mu \in \mathbf{C}$ is any complex number it is clear from (i)' and (11) in §1-3 that
for each $b \in B$ the power $P(z, b)^\mu$ gives an eigenfunction of the non-Euclidean
Laplacian $\Delta$. A direct computation gives

\[ \Delta_z(P(z, b)^\mu) = 4\mu(\mu - 1)P(z, b)^\mu \]

which shows that the eigenvalue is independent of b. Note that the eigenvalue
is $\ge -1$ (and real) if and only if $\mu \in \mathbf{R}$. We shall now consider the problem
of constructing the most general eigenfunctions of $\Delta$.
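
The eigenvalue relation can be tested numerically. The sketch below is an illustration only and rests on one assumption that is not restated in this section: that the non-Euclidean Laplacian of §1-2 is normalized as $(1 - |z|^2)^2(\partial^2/\partial x^2 + \partial^2/\partial y^2)$. Second derivatives are approximated by central differences, so agreement is only up to discretization error. The function names are hypothetical.

```python
import math

def poisson(x, y, phi):
    """Poisson kernel P(z, b) for z = x + iy in the unit disk and b = e^{i phi}."""
    dist2 = (x - math.cos(phi)) ** 2 + (y - math.sin(phi)) ** 2   # |z - b|^2
    return (1.0 - x * x - y * y) / dist2

def laplace_beltrami(F, x, y, h=1e-3):
    """Assumed normalization (1 - |z|^2)^2 (d^2/dx^2 + d^2/dy^2), via central differences."""
    flat = (F(x + h, y) + F(x - h, y) + F(x, y + h) + F(x, y - h) - 4 * F(x, y)) / (h * h)
    return (1.0 - x * x - y * y) ** 2 * flat

mu = 0.3 + 0.7j        # an arbitrary complex exponent
phi = 1.1              # boundary point b = e^{i phi}
x0, y0 = 0.3, -0.2     # interior point z

F = lambda x, y: poisson(x, y, phi) ** mu
ratio = laplace_beltrami(F, x0, y0) / F(x0, y0)   # numerical eigenvalue
expected = 4 * mu * (mu - 1)
print(ratio, expected)
```

Changing `phi` leaves `ratio` unchanged up to discretization error, in line with the remark that the eigenvalue is independent of b.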

Let A(B) denote the set of analytic functions on the boundary B, considered
as an analytic manifold. The space A(B) carries a natural topology (see,
for example, Köthe [48]). The continuous linear functions $A(B) \to \mathbf{C}$ are
called analytic functionals on B; they constitute the dual space A'(B) of
A(B). If $T \in A'(B)$, $f \in A(B)$ we write for T(f) also $\int_B f(b)\, dT(b)$, since the
elements of A'(B) are generalizations of measures. For the eigenfunctions of $\Delta$
we have the following result (unpublished):


Theorem 5.1. The functions
\[ F(z) = \int_B P(z, b)^\mu\, dT(b) \]

where $\mu \in \mathbf{R}$ and T is an analytic functional on B constitute precisely the
eigenfunctions of $\Delta$ with eigenvalue $\ge -1$.


CHAPTER 2: LIE GROUPS AND LIE ALGEBRAS 


2-1 The Lie Algebra of a Lie Group 


Let M be a manifold, p a point in M, and $M_p$ the tangent space to M at
p; this is a vector space over R. In differential geometry one studies a manifold
by means of its family of tangent spaces to which numerous objects are
associated (vector fields, differential forms, arbitrary tensor fields).

If G is a Lie group, the tangent space $G_g$ at an arbitrary point $g \in G$ is
obtained from $G_e$ (e is the identity element) by the left translation $L_g : x \to
gx$ ($x \in G$), that is, $G_g = dL_g(G_e)$. This circumstance makes it possible to
introduce an additional structure on $G_e$ as follows:


Let $X, Y \in G_e$. Then we obtain vector fields $\tilde X$, $\tilde Y$ on G by left translations:

\[ \tilde X_g = dL_g(X), \qquad \tilde Y_g = dL_g(Y), \qquad g \in G \]

The bracket $[\tilde X, \tilde Y] = \tilde X\tilde Y - \tilde Y\tilde X$ is another vector field on G which is invariant
under left translations so there exists a unique vector $Z \in G_e$ such that

\[ [\tilde X, \tilde Y]_e = Z \]

We write [X, Y] instead of Z. The vector space $G_e$ with the rule of composition
$(X, Y) \to [X, Y]$ is called the Lie algebra of G and will be denoted by $\mathfrak{g}$.


The bracket [ , ] has the following properties:
(a) [X, Y] = -[Y, X]
(b) [X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0


A vector space $\mathfrak{a}$ with a bilinear map $(X, Y) \to [X, Y]$ of $\mathfrak{a} \times \mathfrak{a}$ into $\mathfrak{a}$ satisfying
(a) and (b) above is called a Lie algebra. For Lie algebras one can in an
obvious manner define subalgebras, ideals, homomorphisms, isomorphisms,
and automorphisms.

If V is a vector space let gl(V) denote the vector space of all linear transformations
of V into V, with the bracket operation [A, B] = AB - BA.
Then gl(V) is a Lie algebra. A homomorphism of a Lie algebra $\mathfrak{a}$ into gl(V)
is called a representation of $\mathfrak{a}$ on V. In particular, if for a given $X \in \mathfrak{a}$, the
mapping $Y \to [X, Y]$ is denoted ad X, the mapping ad: $X \to$ ad X is a representation
of $\mathfrak{a}$ on $\mathfrak{a}$. The kernel of ad is called the center of $\mathfrak{a}$; $\mathfrak{a}$ is called
Abelian if its center is $\mathfrak{a}$; that is, if [X, Y] = 0 for all $X, Y \in \mathfrak{a}$.


2-2 The Exponential Mapping 


Let G be a Lie group with Lie algebra $\mathfrak{g}$. Let $X \in \mathfrak{g}$ and let $\tilde X$ be the
left invariant vector field on G such that $\tilde X_e = X$. Let $\phi(t)$ ($t \in \mathbf{R}$) be the
integral curve to $\tilde X$ passing through e, that is,

\[ \dot\phi(t) = d\phi\Big(\frac{d}{dt}\Big) = \tilde X_{\phi(t)}, \qquad \phi(0) = e \tag{1} \]

For small t, $\phi(t)$ exists and is unique because (1) is a first-order system of
ordinary differential equations. For the global statement one uses the group
property to continue the solution. The mapping exp: $\mathfrak{g} \to G$ is now defined
by

\[ \exp X = \phi(1) \]


and is called the exponential mapping. It sets up a very far-reaching rela- 
tionship between g and G; some of the main results will be summarized below. 
First we have 


(i) exp sX exp tX = exp (s + t)X    (s, t ∈ R)


that is, the curve $t \to \exp tX$ is a one-parameter subgroup of G. In fact, if
$s \in \mathbf{R}$, then $L_{\exp sX}$ maps $\tilde X$ into itself so it maps the integral curve through e
into the integral curve through exp sX. Thus, $L_{\exp sX}(\phi(t)) = \phi(s + t)$ which
is (i).

By the definition of $\tilde X$,

\[ (\tilde X f)(g) = \Big\{ \frac{d}{dt} f(g \exp tX) \Big\}_{t=0}, \qquad f \in C^\infty(G),\ g \in G \]

Thus the value of the function $\tilde X f$ at g exp sX is
\[ (\tilde X f)(g \exp sX) = \Big\{ \frac{d}{dt} f(g \exp sX \exp tX) \Big\}_{t=0} = \frac{d}{ds} f(g \exp sX) \]
and by induction, if $n \in \mathbf{Z}^+$,
\[ (\tilde X^n f)(g \exp sX) = \frac{d^n}{ds^n} f(g \exp sX) \tag{2} \]


(ii) If a function f is analytic in a neighborhood of a point $g \in G$, then

\[ f(g \exp X) = \sum_{n=0}^{\infty} \frac{1}{n!} (\tilde X^n f)(g) \tag{3} \]

for all X in some neighborhood of 0 in $\mathfrak{g}$.
This relation follows by using (2) in Taylor's formula for the function
$s \to f(g \exp sX)$.


(iii) The mapping $X \to \exp X$ is a diffeomorphism of an open neighborhood
of 0 in $\mathfrak{g}$ onto an open neighborhood of e in G.

This is a direct consequence of the fact that the mapping $X \to \exp X$
has Jacobian ≠ 0 at the origin X = 0.


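(iv) If $X, Y \in \mathfrak{g}$ then
\[ \exp tX \exp tY = \exp\{t(X + Y) + \tfrac12 t^2[X, Y] + O(t^3)\} \tag{4} \]

where $O(t^3)$ denotes a vector such that $t^{-3}O(t^3)$ is bounded near t = 0.
In fact, by (iii), we have for small t,

\[ \exp tX \exp tY = \exp Z(t) \tag{5} \]
where $t \to Z(t)$ is a curve in $\mathfrak{g}$, analytic at t = 0 and
\[ Z(t) = tZ_1 + t^2Z_2 + O(t^3) \qquad (Z_1, Z_2 \in \mathfrak{g}) \]

But by (2) and (3) we have for f analytic at e,

\[ f(\exp tX \exp tY) = \sum_{m,n \ge 0} \frac{t^{m+n}}{m!\,n!} (\tilde X^m \tilde Y^n f)(e) \]

whereas
\[ f(\exp Z(t)) = \sum_{n \ge 0} \frac{1}{n!} \big[(t\tilde Z_1 + t^2\tilde Z_2 + O(t^3))^n f\big](e) \]

Comparing coefficients we get $Z_1 = X + Y$, $\tfrac12\tilde Z_1^2 + \tilde Z_2 = \tfrac12\tilde X^2 + \tilde X\tilde Y + \tfrac12\tilde Y^2$,
whence $Z_2 = \tfrac12[X, Y]$, proving (4).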
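
Formula (4) can be illustrated with matrices. The sketch below is an illustration only, not part of the original text; it assumes the matrix exponential series for exp (justified in the fundamental example later in this section) and picks two arbitrary non-commuting 2 × 2 matrices. It compares the product exp tX exp tY with the first-order guess exp t(X + Y) and with the second-order expression from (4). Halving t should divide the first error by about 4 and the second by about 8.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def add(A, B, c=1.0):
    """Return A + c*B."""
    return [[a + c * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def expm(A, terms=40):
    """Matrix exponential by its power series (adequate for small matrices of modest norm)."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, A))
        result = add(result, term)
    return result

def bracket(A, B):
    return add(matmul(A, B), matmul(B, A), -1.0)

def dist(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

X = [[0.0, 1.0], [0.0, 0.0]]   # arbitrary non-commuting pair
Y = [[0.0, 0.0], [1.0, 0.0]]

def errors(t):
    lhs = matmul(expm(scale(t, X)), expm(scale(t, Y)))
    first = expm(scale(t, add(X, Y)))
    second = expm(add(scale(t, add(X, Y)), scale(0.5 * t * t, bracket(X, Y))))
    return dist(lhs, first), dist(lhs, second)

e1_a, e2_a = errors(0.1)
e1_b, e2_b = errors(0.05)
print(e1_a / e1_b, e2_a / e2_b)   # roughly 4 and roughly 8
```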


From (4) we deduce that
\[ \exp(-tX) \exp(-tY) \exp tX \exp tY = \exp\{t^2[X, Y] + O(t^3)\} \]

which shows that [X, Y] is the tangent vector at e to the curve

\[ t \to \exp(-\sqrt{t}\,X) \exp(-\sqrt{t}\,Y) \exp(\sqrt{t}\,X) \exp(\sqrt{t}\,Y) \]


(v) Two Lie groups are locally isomorphic if and only if their Lie alge- 
bras are isomorphic. 


The "only if" part is immediate from (4). On the other hand, it is
possible to carry further the computation above and express Z(t) in (5)
completely in terms of t, X, Y and their repeated brackets. (The resulting
formula is the so-called Campbell-Hausdorff formula; see, for example,
Jacobson [43].) The "if" part of (v) is an immediate consequence.


A Fundamental Example 


Let GL(n, R) denote the group of real n × n matrices of determinant ≠ 0
and gl(n, R) the Lie algebra of all real n × n matrices, the bracket being
[A, B] = AB - BA. If $\sigma = (x_{ij}(\sigma))$ is a matrix in GL(n, R) we consider the
matrix elements $x_{ij}(\sigma)$ as coordinates of $\sigma$ whereby GL(n, R) is a manifold; if
we express $x_{ij}(\sigma\tau^{-1})$ ($\sigma, \tau \in GL(n, \mathbf{R})$) in terms of $x_{ij}(\sigma)$, $x_{ij}(\tau)$ by ordinary
matrix multiplication we see that GL(n, R) is a Lie group. Let $\mathfrak{g}$ denote its
Lie algebra and if $X \in \mathfrak{g}$ let $\tilde X$ denote the left invariant vector field on GL(n, R)
satisfying $\tilde X_e = X$. Let $(X_{ij})$ denote the matrix $(\tilde X_e x_{ij})$ and consider the
mapping $\phi: X \to (X_{ij})$ of $\mathfrak{g}$ into gl(n, R). The mapping $\phi$ is linear, one-to-one
and onto. Furthermore if $L_\sigma$ denotes the left translation $\tau \to \sigma\tau$ we have
by the left invariance of $\tilde X$,

\[ (\tilde X x_{ij})(\sigma) = X(x_{ij} \circ L_\sigma) \]
But

\[ (x_{ij} \circ L_\sigma)(\tau) = x_{ij}(\sigma\tau) = \sum_k x_{ik}(\sigma) x_{kj}(\tau) \]
so
\[ (\tilde X x_{ij})(\sigma) = \sum_k x_{ik}(\sigma) X_{kj} \tag{6} \]
It follows that
\[ (\tilde X\tilde Y - \tilde Y\tilde X)_e x_{ij} = \sum_k (X_{ik}Y_{kj} - Y_{ik}X_{kj}) = [\phi(X), \phi(Y)]_{ij} \]

so $\phi$ is a Lie algebra isomorphism. Thus the Lie algebra of GL(n, R) is


identified with gl(n, R). In this statement one can replace the real field R by
the field C. In view of (2) and (6) we have

\[ \frac{d}{dt} x_{ij}(\exp tX) = \sum_k x_{ik}(\exp tX) X_{kj} \]
so the matrix function Y(t) = exp tX satisfies
\[ \frac{d}{dt} Y(t) = Y(t)X, \qquad Y(0) = I \tag{7} \]

But this equation is also satisfied by the matrix exponential function
\[ e^{tX} = I + tX + \tfrac12 t^2X^2 + \cdots \]

so $\exp X = e^X$ for all X ∈ gl(n, R). Thus the exponential mapping for Lie
groups generalizes the exponential function for matrices.
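
As a small illustration (a sketch with an arbitrarily chosen 2 × 2 matrix, not taken from the original text), the series for the matrix exponential can be summed directly and checked against property (i) and against equation (7). The derivative in (7) is approximated by a central difference, so the check holds only up to discretization error.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def add(A, B, c=1.0):
    """Return A + c*B."""
    return [[a + c * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def expm(A, terms=40):
    """I + A + A^2/2! + ... truncated after `terms` terms."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, A))
        result = add(result, term)
    return result

def dist(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

X = [[0.2, 1.0], [-0.5, 0.1]]   # arbitrary element of gl(2, R)
s, t = 0.7, -0.4

# Property (i): exp sX exp tX = exp (s + t)X
group_law_error = dist(matmul(expm(scale(s, X)), expm(scale(t, X))), expm(scale(s + t, X)))

# Equation (7): dY/dt = Y(t) X, with the derivative taken by central differences
h = 1e-5
derivative = scale(1.0 / (2 * h), add(expm(scale(t + h, X)), expm(scale(t - h, X)), -1.0))
ode_error = dist(derivative, matmul(expm(scale(t, X)), X))
print(group_law_error, ode_error)
```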

Let G be any Lie group. A Lie group H is called a Lie subgroup of G
if it is a subgroup of G and a submanifold of G. If this is the case the Lie
algebra $\mathfrak{h}$ of H is a subalgebra of the Lie algebra $\mathfrak{g}$ of G, and the exponential
maps for $\mathfrak{h}$ and $\mathfrak{g}$ coincide on $\mathfrak{h}$.


(vi) Let G be a Lie group with Lie algebra $\mathfrak{g}$. Let $\mathfrak{h} \subset \mathfrak{g}$ be a subalgebra.
Then there exists exactly one connected Lie subgroup H of G with Lie algebra
$\mathfrak{h}$.


This important fact is proved along the following lines: Consider the
(abstract) subgroup H of G generated by the set exp $\mathfrak{h}$. Using (iii), one
introduces a topology in H (this is not necessarily the relative topology of G)
as well as a coordinate system near the identity of H. By left translations on
H this gives a coordinate system in some neighborhood of an arbitrary point
of H and one must finally prove that this manifold structure on H has the
required properties. A connected Lie subgroup is usually called an analytic
subgroup.


(vii) Let G be a Lie group and H a subgroup of G which is closed as a 
subset of G. Then there exists a unique manifold structure on H such that 
H is a topological Lie subgroup of G. 

If $\mathfrak{h}$ and $\mathfrak{g}$ are the respective Lie algebras of H and G then

\[ \mathfrak{h} = \{X \in \mathfrak{g} \mid \exp tX \in H \text{ for all } t \in \mathbf{R}\} \tag{8} \]


Example 


Let us use (8) to find the Lie algebra of the group SU(1, 1) considered
in Chapter 1. First note that SU(1, 1) is the group of matrices of determinant
1 leaving invariant the Hermitian form $-z_1\bar z_1 + z_2\bar z_2$, that is, a matrix A
belongs to SU(1, 1) if and only if

\[ {}^tA\, J\, \bar A = J, \qquad \det A = 1 \]
where ${}^tA$ is the transpose of A, $\bar A$ its complex conjugate, and
\[ J = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \]
Since gl(2, C) is the Lie algebra of GL(2, C) we see that X belongs to the Lie
algebra su(1, 1) of SU(1, 1) if and only if
\[ {}^t(\exp sX)\, J\, \overline{\exp sX} = J, \qquad \det(\exp sX) = 1 \qquad (s \in \mathbf{R}) \]
But $\exp({}^tX) = {}^t(\exp X)$, so the first relation can be written
\[ \exp s\bar X = J \exp(-s\,{}^tX) J^{-1} = \exp s(-J\,{}^tX J^{-1}) \qquad (s \in \mathbf{R}) \]

Thus X ∈ su(1, 1) if and only if $\bar X = -J\,{}^tX J^{-1}$ and Trace X = 0. This is
equivalent to

\[ su(1, 1) = \Big\{ X = \begin{pmatrix} i\alpha & \beta \\ \bar\beta & -i\alpha \end{pmatrix} \ \Big|\ \alpha \in \mathbf{R},\ \beta \in \mathbf{C} \Big\} \]
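
The description of su(1, 1) can be sanity-checked numerically. This is a sketch only: it assumes the convention that the invariance condition involves the transpose of A, the matrix J = diag(-1, 1), and the complex conjugate of A, and it uses arbitrarily chosen values of $\alpha$, $\beta$ and s. It exponentiates a matrix of the stated form and verifies that the result preserves the Hermitian form and has determinant 1.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def expm(A, terms=40):
    """Matrix exponential by its power series; works for complex entries."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

alpha, beta = 0.7, 0.4 - 1.1j                       # arbitrary real alpha, complex beta
X = [[1j * alpha, beta], [beta.conjugate(), -1j * alpha]]
s = 0.8
A = expm([[s * x for x in row] for row in X])       # A = exp sX

J = [[-1.0, 0.0], [0.0, 1.0]]
A_transpose = [[A[j][i] for j in range(2)] for i in range(2)]
A_bar = [[a.conjugate() for a in row] for row in A]
form = matmul(matmul(A_transpose, J), A_bar)        # should reproduce J
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]         # should equal 1
form_error = max(abs(form[i][j] - J[i][j]) for i in range(2) for j in range(2))
print(form_error, abs(det - 1))
```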


Property (v) shows that local properties of a Lie group are completely 
determined by the Lie algebra. This is of great consequence because all the 
machinery of linear algebra (theory of linear transformations of a vector 
space) can be applied to Lie algebras. In particular, let us see how the left 
invariant Haar measure on a Lie group can be written in Lie algebra terms. 

Consider a Lie group G with Lie algebra $\mathfrak{g}$. If $X \in \mathfrak{g}$ the differential
of the exponential map at X maps the tangent space $\mathfrak{g}_X$ onto the tangent
space $G_{\exp X}$, which is $dL_{\exp X}(\mathfrak{g})$ (since $\mathfrak{g} = G_e$). We identify $\mathfrak{g}_X$ with $\mathfrak{g}$ via the
ordinary parallelism of vectors. Thus if $Y \in \mathfrak{g}$ there exists a unique vector
$Z \in \mathfrak{g}$ such that

\[ d\exp_X(Y) = (dL_{\exp X})(Z) \]

Let us compute Z. By the definition of the differential of a map we have if
f is differentiable at exp X,

\[ d\exp_X(Y)f = Y_X(f \circ \exp) \tag{9} \]

where $Y_X$ is the vector Y viewed as a tangent vector to $\mathfrak{g}$ at X. But

\[ Y_X(f \circ \exp) = \Big\{ \frac{d}{dt} f(\exp(X + tY)) \Big\}_{t=0} \tag{10} \]


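Now take f to be analytic at e. Then if X and t are sufficiently small,
\[ f(\exp(X + tY)) = \sum_{n=0}^{\infty} \frac{1}{n!} \big[(\tilde X + t\tilde Y)^n f\big](e) \]
so by (9) and (10),

\[ d\exp_X(Y)f = \sum_{n=0}^{\infty} \frac{1}{(n+1)!} \big[(\tilde X^n\tilde Y + \tilde X^{n-1}\tilde Y\tilde X + \cdots + \tilde Y\tilde X^n) f\big](e) \]

Now consider the algebra generated by the left invariant vector fields on G
and the operators

\[ L(\tilde X): A \to \tilde X A, \qquad R(\tilde X): A \to A\tilde X, \qquad \theta(\tilde X): A \to \tilde X A - A\tilde X \]
of this algebra. Then $\theta(\tilde X) = L(\tilde X) - R(\tilde X)$ and $L(\tilde X)$ and $\theta(\tilde X)$ commute so

\[ R(\tilde X)^p = (L(\tilde X) - \theta(\tilde X))^p = \sum_{k=0}^{p} (-1)^k \binom{p}{k} L(\tilde X)^{p-k}\, \theta(\tilde X)^k \]
and
\[ \tilde X^n\tilde Y + \tilde X^{n-1}\tilde Y\tilde X + \cdots + \tilde Y\tilde X^n = \sum_{p=0}^{n} \sum_{k=0}^{n-p} (-1)^k \binom{n-p}{k} \tilde X^{n-k}\, \theta(\tilde X)^k(\tilde Y) \]

which by the elementary formula
\[ \sum_{p=0}^{n} \binom{n-p}{k} = \binom{n+1}{k+1} \]
equals

\[ \sum_{k=0}^{n} \binom{n+1}{k+1} \tilde X^{n-k}\, \theta(-\tilde X)^k(\tilde Y) \]

Hence,
\[ d\exp_X(Y)f = \sum_{n=0}^{\infty} \Big[ \sum_{k=0}^{n} \frac{\tilde X^{n-k}}{(n-k)!}\, \frac{\theta(-\tilde X)^k}{(k+1)!}(\tilde Y) f \Big](e) \]

For sufficiently small X one can use the analyticity of f to interchange the
two summations and use the formula

\[ \sum_{n=0}^{\infty} \sum_{k=0}^{n} a_{n-k,k} = \sum_{r,s=0}^{\infty} a_{r,s} \]

to equate the right-hand side with

\[ \sum_{r=0}^{\infty} \frac{1}{r!} \Big[ \tilde X^r \sum_{s=0}^{\infty} \frac{\theta(-\tilde X)^s}{(s+1)!}(\tilde Y) f \Big](e) \tag{12} \]

which by (3) equals

\[ \Big[ \sum_{k=0}^{\infty} \frac{\theta(-\tilde X)^k}{(k+1)!}(\tilde Y) f \Big](\exp X) \]

But $\theta(-\tilde X)^k(\tilde Y)$ is the left invariant vector field corresponding to the vector
$\mathrm{ad}(-X)^k(Y)$ in $\mathfrak{g}$ so we have proved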


\[ d\exp_X(Y) = dL_{\exp X}\Big( \frac{1 - e^{-\mathrm{ad}\,X}}{\mathrm{ad}\,X}(Y) \Big) \tag{13} \]

at least if X is sufficiently small. Because of the analyticity of both sides (13)
holds actually for all $X \in \mathfrak{g}$.
Note that in the right-hand side of (13),

\[ \frac{1 - e^{-\mathrm{ad}\,X}}{\mathrm{ad}\,X} = \int_0^1 e^{-t\,\mathrm{ad}\,X}\, dt \]

Now let exp: $V_0 \to V_e$ be a diffeomorphism, $V_0$ and $V_e$ being open sets
in $\mathfrak{g}$ and G, respectively. Let $f \in C_c^\infty(G)$ have support contained in $V_e$. If
dx denotes a left invariant Haar measure on G we have

\[ \int_G f(x)\, dx = \int_{\mathfrak{g}} f(\exp X) J(X)\, dX \]

dX being a Euclidean volume element on $\mathfrak{g}$ and J the Jacobian of the exponential
map. In view of (13) we have

\[ \int_G f(x)\, dx = c \int_{\mathfrak{g}} f(\exp X) \det\Big( \frac{1 - e^{-\mathrm{ad}\,X}}{\mathrm{ad}\,X} \Big)\, dX \tag{14} \]

where c is a constant. For a formulation of (13) for differential forms see
[11] p. 21 and [12] p. 157. For a generalization to Riemannian manifolds
see [30].
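
For a matrix group, (13) reads $\frac{d}{dt} e^{X+tY}\big|_{t=0} = e^X \sum_{k \ge 0} \frac{(-\mathrm{ad}\,X)^k}{(k+1)!}(Y)$, since left translation by exp X is then multiplication by $e^X$ on the left. The following sketch (an illustration with arbitrarily chosen 2 × 2 matrices, not part of the original text) compares a central-difference derivative with the truncated series. The helper name `dexp_left` is hypothetical.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def add(A, B, c=1.0):
    """Return A + c*B."""
    return [[a + c * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def expm(A, terms=40):
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, A))
        result = add(result, term)
    return result

def bracket(A, B):
    return add(matmul(A, B), matmul(B, A), -1.0)

def dexp_left(X, Y, terms=30):
    """sum_k (-ad X)^k (Y) / (k+1)!, the series for ((1 - e^{-ad X}) / ad X)(Y)."""
    n = len(X)
    total = [[0.0] * n for _ in range(n)]
    power = [row[:] for row in Y]          # (-ad X)^k (Y), starting with k = 0
    factorial = 1.0                        # will hold (k+1)!
    for k in range(terms):
        factorial *= (k + 1)
        total = add(total, power, 1.0 / factorial)
        power = bracket(power, X)          # power*X - X*power = (-ad X)(power)
    return total

X = [[0.3, -0.8], [0.5, 0.1]]
Y = [[-0.2, 0.4], [0.9, 0.6]]
h = 1e-5
numeric = scale(1.0 / (2 * h), add(expm(add(X, Y, h)), expm(add(X, Y, -h)), -1.0))
formula = matmul(expm(X), dexp_left(X, Y))
dexp_error = max(abs(a - b) for ra, rb in zip(numeric, formula) for a, b in zip(ra, rb))
print(dexp_error)
```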


(ix) Given a Lie algebra g over R there exists a Lie group G with Lie 
algebra g. 


The local result is called the third fundamental theorem of Lie; the
global statement was later proved by E. Cartan. One proof of (ix) uses Ado's
theorem that there exists an isomorphism of $\mathfrak{g}$ into gl(n, R). Then the desired
G can by (vi) be taken as a suitable subgroup of GL(n, R). Another proof
will be indicated later.


CHAPTER 3: STRUCTURE THEORY OF LIE GROUPS 


3-1 Solvable and Semisimple Lie Algebras 


Let g be a Lie algebra and as before let ad X denote the linear transfor- 
mation Y—> [X, Y] of g. Lie algebra theory is concerned with this family of 
linear transformations. 


The vector space spanned by all elements [X, Y] is an ideal in $\mathfrak{g}$, called
the derived algebra of $\mathfrak{g}$ and denoted $D\mathfrak{g}$. The nth derived algebra $D^n\mathfrak{g}$ of $\mathfrak{g}$
is defined inductively by $D^0\mathfrak{g} = \mathfrak{g}$, $D^n\mathfrak{g} = D(D^{n-1}\mathfrak{g})$. A Lie algebra is called
solvable if $D^n\mathfrak{g} = \{0\}$ for some $n \ge 0$. A Lie group is called solvable if its
Lie algebra is solvable.

A Lie algebra is called nilpotent if for each $X \in \mathfrak{g}$, ad X is nilpotent. It
can be proved that a Lie algebra is solvable if and only if its derived algebra
is nilpotent. In particular we see that a nilpotent Lie algebra is solvable.


Example 


Let t(n) denote the Lie subalgebra of gl(n, R) formed by the upper
triangular matrices and let n(n) denote the subalgebra of matrices in t(n) with
diagonal 0. Then t(n) is solvable, n(n) nilpotent and coincides with the derived
algebra of t(n).

Let $\mathfrak{g}$ be a Lie algebra. The Killing form of $\mathfrak{g}$ is defined as the bilinear
form B(X, Y) = Tr (ad X ad Y) (Tr = trace); $\mathfrak{g}$ is called semisimple if B is
nondegenerate and $\mathfrak{g}$ is called simple if in addition it has no ideals except 0
and $\mathfrak{g}$.


Example 


Let SL(n, R) denote the group of n × n real matrices of determinant 1.
It is a closed subgroup of GL(n, R), hence a Lie subgroup [cf. (vii) §2-2] and
since the relation $\det(e^A) = e^{\mathrm{Tr}\,A}$ holds for any matrix A, we see from (8)
§2-2 that the Lie algebra sl(n, R) of the subgroup SL(n, R) of GL(n, R) is the
subalgebra of gl(n, R) consisting of all n × n matrices of trace 0. This
statement holds also with R replaced with the complex field C. Let us
compute the Killing form of sl(n, C). Let d(n) denote the set of diagonal
matrices in sl(n, C). If H ∈ d(n) each matrix $E_{ij}$ with 1 at the ith row and the
jth column, 0 elsewhere,

\[ E_{ij} = (\delta_{ai}\delta_{bj})_{1 \le a, b \le n} \]
is an eigenvector for ad H and we find easily that
\[ \mathrm{Tr}(\mathrm{ad}\,H\ \mathrm{ad}\,H) = 2n\, \mathrm{Tr}(HH) \tag{1} \]


The mapping $X \to gXg^{-1}$ (g ∈ GL(n, C)) is an automorphism of sl(n, C) and
any automorphism of a Lie algebra leaves the Killing form invariant. If
$gXg^{-1} \in$ d(n) we have therefore

\[ \mathrm{Tr}(\mathrm{ad}\,X\ \mathrm{ad}\,X) = \mathrm{Tr}(\mathrm{ad}(gXg^{-1})\ \mathrm{ad}(gXg^{-1})) = 2n\, \mathrm{Tr}(gXXg^{-1}) \tag{2} \]

so
\[ \mathrm{Tr}(\mathrm{ad}\,X\ \mathrm{ad}\,X) = 2n\, \mathrm{Tr}(XX) \tag{3} \]

The matrices which are conjugate to a diagonal matrix in d(n) form a dense
subset of sl(n, C) so (3) holds for all X ∈ sl(n, C). Hence by "polarization"

\[ B(X, Y) = 2n\, \mathrm{Tr}(XY) \qquad \text{for } X, Y \in sl(n, \mathbf{C}) \tag{4} \]

It is a trivial matter to verify that B in (4) is nondegenerate so sl(n, C) is
semisimple.
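
Formula (4) can be verified directly for n = 2. The sketch below is an illustration, not part of the original text: it uses the basis H = diag(1, -1), E = $E_{12}$, F = $E_{21}$ of sl(2, C), builds the 3 × 3 matrix of ad X in that basis, and compares Tr (ad X ad Y) with 4 Tr (XY). The test matrices and function names are arbitrary choices.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def bracket(A, B):
    P, Q = matmul(A, B), matmul(B, A)
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

H = [[1, 0], [0, -1]]
E = [[0, 1], [0, 0]]
F = [[0, 0], [1, 0]]
BASIS = [H, E, F]

def coords(M):
    """Coordinates (a, b, c) of a traceless 2x2 matrix M = a*H + b*E + c*F."""
    return [M[0][0], M[0][1], M[1][0]]

def ad_matrix(X):
    """Matrix of ad X in the basis (H, E, F); column j holds the coordinates of [X, basis_j]."""
    cols = [coords(bracket(X, B)) for B in BASIS]
    return [[cols[j][i] for j in range(3)] for i in range(3)]

def killing(X, Y):
    P = matmul(ad_matrix(X), ad_matrix(Y))
    return sum(P[i][i] for i in range(3))

def trace_form(X, Y):
    """2n Tr(XY) with n = 2."""
    P = matmul(X, Y)
    return 4 * (P[0][0] + P[1][1])

X = [[0.5 + 1j, 2.0], [-1.0 + 0.5j, -0.5 - 1j]]   # arbitrary traceless complex matrices
Y = [[-0.3, 1j], [0.7, 0.3]]
print(killing(X, Y), trace_form(X, Y))
```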

A fundamental result in Lie algebra theory (the Levi decomposition) 
states that every Lie algebra g is the direct vector space sum 


g=r+s (5) 


where r is the maximal solvable ideal in g and s is a semisimple subalgebra. 
To a large extent this result splits Lie group theory into two branches—one 
for solvable Lie groups, the other for semisimple Lie groups. The latter 
branch is further developed and has had more contact with physics and 
geometry and is therefore emphasized in these lectures. (Of course the two 
branches are related because semisimple Lie algebras always have solvable 
subalgebras.) 

The Levi decomposition can for example be used as a basis of an alternative
proof of (ix) §2-2. Let Aut (s) denote the group of all automorphisms
of s. This is a closed subgroup of GL(s), hence a Lie subgroup, and by (8)
§2-2 its Lie algebra is given by the set of endomorphisms A of s for which
$e^{tA} \in$ Aut (s) for all t ∈ R. But the relation

\[ e^{tA}[X, Y] = [e^{tA}X, e^{tA}Y] \]
implies (by differentiation)
\[ A[X, Y] = [AX, Y] + [X, AY] \tag{6} \]

and vice versa. A linear transformation A satisfying (6) for all X, Y ∈ s is
called a derivation of s so we see that the Lie algebra of Aut (s) is the set of
derivations of s. On the other hand, if X ∈ s, ad X is obviously a derivation
of s. Using the semisimplicity, one can prove that all derivations of s are
of this form. Thus ad (s) is the Lie algebra of Aut (s); but the semisimplicity
of s shows that $X \to$ ad X is an isomorphism so we have verified that any
semisimple Lie algebra is the Lie algebra of a Lie group. For solvable Lie
algebras the statement can be proved by induction and by the Levi decomposition
(5) the theorem can be proved in general by taking appropriate semidirect
products.

For any Lie algebra $\mathfrak{g}$, let Int ($\mathfrak{g}$) denote the connected Lie subgroup of
GL($\mathfrak{g}$) with Lie algebra ad ($\mathfrak{g}$) ⊂ gl($\mathfrak{g}$); Int ($\mathfrak{g}$) is called the adjoint group of $\mathfrak{g}$.
If $\mathfrak{g}$ is semisimple then Int ($\mathfrak{g}$) is the identity component of Aut ($\mathfrak{g}$). If G is a
Lie group with Lie algebra $\mathfrak{g}$, and g ∈ G, the inner automorphism $x \to gxg^{-1}$
of G induces an automorphism of $\mathfrak{g}$, denoted Ad (g). If G is connected,
Ad (G) = Int ($\mathfrak{g}$). In fact if $X, Y \in \mathfrak{g}$ we obtain by iterating (4) §2-2

\[ \exp(\mathrm{Ad}(\exp tX)\, tY) = \exp tX \exp tY \exp(-tX) = \exp(tY + t^2[X, Y] + O(t^3)) \tag{7} \]
so
\[ \mathrm{Ad}(\exp tX)\, Y = Y + t[X, Y] + O(t^2) \tag{8} \]

On the other hand, the mapping $g \to$ Ad (g) is a homomorphism of G into
GL($\mathfrak{g}$). Hence $t \to$ Ad (exp tX) is a one-parameter subgroup of GL($\mathfrak{g}$), thus
by the fundamental example in Ch. 2 of the form

\[ \mathrm{Ad}(\exp tX) = e^{tA} \]
But then (8) shows A = ad X so
\[ \mathrm{Ad}(\exp X) = e^{\mathrm{ad}\,X} \tag{9} \]

and the relation Ad (G) = Int ($\mathfrak{g}$) follows.
The homomorphism $g \to$ Ad (g) is called the adjoint representation of
G. For clarity it is sometimes written $\mathrm{Ad}_G$.
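
For a matrix group the adjoint representation is conjugation, Ad (g)Y = $gYg^{-1}$, so (9) says $e^X Y e^{-X} = \sum_{k \ge 0} (\mathrm{ad}\,X)^k(Y)/k!$. The sketch below, with arbitrarily chosen 2 × 2 matrices and hypothetical helper names, compares the two sides numerically.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def add(A, B, c=1.0):
    """Return A + c*B."""
    return [[a + c * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def expm(A, terms=40):
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, A))
        result = add(result, term)
    return result

def bracket(A, B):
    return add(matmul(A, B), matmul(B, A), -1.0)

def exp_ad(X, Y, terms=40):
    """sum_k (ad X)^k (Y) / k!, that is, e^{ad X} applied to Y."""
    total = [row[:] for row in Y]
    term = [row[:] for row in Y]
    for k in range(1, terms):
        term = scale(1.0 / k, bracket(X, term))
        total = add(total, term)
    return total

X = [[0.3, -0.8], [0.5, -0.3]]
Y = [[0.1, 0.4], [0.9, -0.1]]
conjugated = matmul(matmul(expm(X), Y), expm(scale(-1.0, X)))   # Ad(exp X) Y
series = exp_ad(X, Y)                                           # e^{ad X} Y
ad_error = max(abs(a - b) for ra, rb in zip(conjugated, series) for a, b in zip(ra, rb))
print(ad_error)
```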


3-2 Structure of Semisimple Lie Algebras 


Let $\mathfrak{g}$ be a semisimple Lie algebra, B its Killing form. If O(B) denotes
the group of linear transformations of $\mathfrak{g}$ leaving B invariant, we have Aut ($\mathfrak{g}$) ⊂
O(B); also

\[ B(X, \mathrm{ad}\,Y(Z)) = -B(\mathrm{ad}\,Y(X), Z) \]
for $X, Y, Z \in \mathfrak{g}$, so each ad Y is skew-symmetric with respect to B.


Definition. A Lie algebra g over R is called compact if its adjoint group 
Int (g) is compact. 


Proposition 2.1 


(i) Let g be a semisimple Lie algebra over R. Then g is compact if 
and only if the Killing form of g is negative definite. 

(ii) Every compact Lie algebra is the direct sum $\mathfrak{g} = \mathfrak{z} + [\mathfrak{g}, \mathfrak{g}]$ where $\mathfrak{z}$
is the center of $\mathfrak{g}$ and the ideal $[\mathfrak{g}, \mathfrak{g}]$ is semisimple and compact.


PROOF OF (i). If the Killing form is negative definite O(B) is compact and
so are the groups Aut ($\mathfrak{g}$) and Int ($\mathfrak{g}$). On the other hand, if Int ($\mathfrak{g}$) is compact
it leaves invariant a positive definite quadratic form Q on $\mathfrak{g}$. Let $X_1, \ldots, X_n$
be a basis of $\mathfrak{g}$ such that

\[ Q(X) = \sum_i x_i^2 \qquad \text{if } X = \sum_i x_i X_i \]

By means of this basis each $\sigma \in$ Int ($\mathfrak{g}$) is given by an orthogonal matrix, so
if $X \in \mathfrak{g}$ each ad X is skew-symmetric, that is, ${}^t(\mathrm{ad}\,X) = -\mathrm{ad}\,X$, where t
denotes transpose. But then,

\[ B(X, X) = \mathrm{Tr}(\mathrm{ad}\,X\ \mathrm{ad}\,X) = -\mathrm{Tr}(\mathrm{ad}\,X\ {}^t(\mathrm{ad}\,X)) = -\sum_{i,j} x_{ij}^2 \qquad \text{if } \mathrm{ad}\,X = (x_{ij}) \]

This proves (i); the second part is proved similarly.

Since the study of Lie algebras amounts to a study of the linear transformations
ad X ($X \in \mathfrak{g}$), the first problem is, of course, diagonalization.
Here one gets further by working with C as the base field, so we make the
following definition.


Definition. Let $\mathfrak{g}$ be a semisimple Lie algebra over C. A Cartan subalgebra
of $\mathfrak{g}$ is a subalgebra $\mathfrak{h}$ such that (1) $\mathfrak{h} \subset \mathfrak{g}$ is a maximal abelian subalgebra; and
(2) for each $H \in \mathfrak{h}$, ad H is a semisimple endomorphism of $\mathfrak{g}$ (that is, it can
be put into diagonal form by means of a suitable basis).

The idea behind this definition is: If $X_1, X_2 \in \mathfrak{g}$ are such that ad $X_1$
and ad $X_2$ have simultaneous diagonalization then [ad $X_1$, ad $X_2$] = 0 so
$[X_1, X_2] = 0$; thus the set ad ($\mathfrak{h}$) is a maximal family of simultaneously diagonalizable
endomorphisms of $\mathfrak{g}$. Although our objective is the study of
semisimple Lie algebras $\mathfrak{a}$ over R the definition above is useful because the
complexification $\mathfrak{g} = \mathfrak{a} + i\mathfrak{a}$ is also semisimple. If $\mathfrak{g}$ is any Lie algebra over
C a real form of $\mathfrak{g}$ is a real linear subspace $\mathfrak{b}$ of $\mathfrak{g}$ (that is, $r \in \mathbf{R}$, $X, Y \in \mathfrak{b} \Rightarrow rX$,
$X + Y \in \mathfrak{b}$) which is closed under the bracket operation and satisfies $\mathfrak{g} =
\mathfrak{b} + i\mathfrak{b}$ (direct sum). The mapping $X + iY \to X - iY$ ($X, Y \in \mathfrak{b}$) is called
the conjugation of $\mathfrak{g}$ with respect to $\mathfrak{b}$. A Lie algebra $\mathfrak{g}$ over C may have
many real forms.


Examples 


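(i) sl(n, R) is a real form of sl(n, C). The diagonal matrices in sl(n, C)
form a Cartan subalgebra.
(ii) The Lie algebra su(1, 1) is a real form of sl(2, C). In fact, if

\[ \begin{pmatrix} z_{11} & z_{12} \\ z_{21} & z_{22} \end{pmatrix} \in sl(2, \mathbf{C}) \]

we can write (since $z_{22} = -z_{11}$)

\[ \begin{pmatrix} z_{11} & z_{12} \\ z_{21} & z_{22} \end{pmatrix} = \begin{pmatrix} i\alpha_1 & \beta_1 \\ \bar\beta_1 & -i\alpha_1 \end{pmatrix} + i \begin{pmatrix} i\alpha_2 & \beta_2 \\ \bar\beta_2 & -i\alpha_2 \end{pmatrix} \]
for $\alpha_1, \alpha_2 \in \mathbf{R}$, $\beta_1, \beta_2 \in \mathbf{C}$.
(iii) The Lie algebra su(2) of skew-Hermitian matrices of trace 0,

\[ X = \begin{pmatrix} i\alpha & \beta \\ -\bar\beta & -i\alpha \end{pmatrix}, \qquad \alpha \in \mathbf{R},\ \beta \in \mathbf{C} \]

is obviously a real form of sl(2, C). Since the Killing form of a real form is
in general obtained by restriction we see from (4) §3-1 that

\[ B(X, X) = 4\, \mathrm{Trace}(XX) = -8(\alpha^2 + |\beta|^2) \]

so su(2) is a compact real form of sl(2, C).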
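
The value of the Killing form on su(2) can also be computed from the definition, without passing through sl(2, C). The sketch below is an illustration under stated assumptions: it treats su(2) as a three-dimensional real Lie algebra with the (assumed) basis diag(i, -i), the real antisymmetric unit, and the imaginary symmetric unit, forms the real 3 × 3 matrix of ad X, and evaluates Tr (ad X ad X) for arbitrary sample values of $\alpha$ and $\beta$.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def bracket(A, B):
    P, Q = matmul(A, B), matmul(B, A)
    return [[p - q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

# Assumed real basis of su(2): X = alpha*E1 + Re(beta)*E2 + Im(beta)*E3
E1 = [[1j, 0], [0, -1j]]
E2 = [[0, 1], [-1, 0]]
E3 = [[0, 1j], [1j, 0]]
BASIS = [E1, E2, E3]

def coords(M):
    """Real coordinates of a skew-Hermitian traceless matrix in the basis (E1, E2, E3)."""
    return [M[0][0].imag, M[0][1].real, M[0][1].imag]

def ad_matrix(X):
    cols = [coords(bracket(X, B)) for B in BASIS]
    return [[cols[j][i] for j in range(3)] for i in range(3)]

def killing(X, Y):
    P = matmul(ad_matrix(X), ad_matrix(Y))
    return sum(P[i][i] for i in range(3))

alpha, beta = 0.6, -0.9 + 0.4j                     # arbitrary sample values
X = [[1j * alpha, beta], [-beta.conjugate(), -1j * alpha]]
b_value = killing(X, X)
expected = -8 * (alpha ** 2 + abs(beta) ** 2)
print(b_value, expected)
```

The result is negative for every nonzero X, which is the criterion of Proposition 2.1 (i) for compactness.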
The following two results are of fundamental importance. 


Theorem 2.2. Every semisimple Lie algebra $\mathfrak{g}$ over C contains a Cartan subalgebra
$\mathfrak{h}$.


Theorem 2.3. Every semisimple Lie algebra g over C has a real form u which is 
compact. 

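Ordinarily Theorem 2.2 is proved first using theorems on solvable Lie
algebras (Lie's theorem that a solvable Lie algebra of complex matrices has
a common eigenvector). The simultaneous diagonalization of the endomorphisms
ad H ($H \in \mathfrak{h}$) leads to a detailed structure theory for $\mathfrak{g}$ by which the compact
real form $\mathfrak{u}$ is constructed. The details are as follows:

Assume $\mathfrak{h}$ is a Cartan subalgebra of $\mathfrak{g}$. Given a linear form $\alpha \ne 0$ on
$\mathfrak{h}$ let

\[ \mathfrak{g}^\alpha = \{X \in \mathfrak{g} \mid \mathrm{ad}\,H(X) = \alpha(H)X \text{ for all } H \in \mathfrak{h}\} \]

This linear form $\alpha$ is called a root if $\mathfrak{g}^\alpha \ne \{0\}$. Let $\Delta$ denote the set of all
roots. Then

\[ \mathfrak{g} = \mathfrak{h} + \sum_{\alpha \in \Delta} \mathfrak{g}^\alpha \qquad \text{(direct sum)} \tag{1} \]

and it can be proved that
\[ \dim \mathfrak{g}^\alpha = 1 \qquad (\alpha \in \Delta) \tag{2} \]

Let $\mathfrak{h}^*$ denote the subset (real-linear subspace) of $\mathfrak{h}$, where all the roots have
real values. Then for a suitable choice of vectors $X_\alpha \in \mathfrak{g}^\alpha$ the set

\[ \mathfrak{u} = i\mathfrak{h}^* + \sum_{\alpha \in \Delta} \mathbf{R}(X_\alpha - X_{-\alpha}) + \sum_{\alpha \in \Delta} \mathbf{R}(i(X_\alpha + X_{-\alpha})) \tag{3} \]

is a compact real form of $\mathfrak{g}$.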


Example


Consider again the Lie algebra $\mathfrak{g}$ = sl(n, C) and its Cartan subalgebra
$\mathfrak{h}$ of diagonal matrices of trace 0. Let again $E_{ij}$ denote the matrix

\[ (\delta_{ai}\delta_{bj})_{1 \le a, b \le n} \]
and for each $H \in \mathfrak{h}$ let $e_i(H)$ denote the ith diagonal element in H. Then
\[ [H, E_{ij}] = (e_i(H) - e_j(H))E_{ij} \]

for all $H \in \mathfrak{h}$ so the linear form $\alpha_{ij}(H) = e_i(H) - e_j(H)$ is a root for i ≠ j and
by (1) this does give all the roots. The space $\mathfrak{h}^*$ consists of all real diagonal
matrices of trace 0. Let us put $X_{\alpha_{ij}} = E_{ij}$ (i ≠ j). Then it is easily seen that
the space (3) is the set su(n) of all skew-Hermitian n × n matrices of trace 0, which is
indeed a compact real form of sl(n, C) (cf. example above).

It is tempting to try to prove Theorem 2.3 directly, because then
Theorem 2.2 would be an immediate corollary. In fact, for each $X \in \mathfrak{u}$,
ad X can be diagonalized, so if $\mathfrak{t} \subset \mathfrak{u}$ is any maximal Abelian subalgebra, the
space $\mathfrak{h} = \mathfrak{t} + i\mathfrak{t}$ is a Cartan subalgebra of $\mathfrak{g}$.

A direct and elementary proof of Theorem 2.3 (without the use of 
Theorem 2.2) does not seem to be available. However, Cartan has proposed 
an idea for this purpose (J. Math. Pures Appl. 8 (1929), p. 23), which I shall 
describe here. 

Since the Killing form of g is nondegenerate, there exists a basis e,,..., 
e, of g such that 


B(Z, Z) = —) 2; if Z=) ze; (4) 
it | 
Let the structural constants c;;, ¢C be determined by 


n 
[e;, e;]= » Ci ik &k 
1 


Then 
B(Z, Z) = Tr (ad Z ad Z) = y ( Cikh cm )2 Zj 
ij \hk 
so by (4) 
2, CikhC jnk = — Oi; (5) 
Also, 
B(LX;, X;], X,) + B(X;, LX;, X,]) = 0 
SO 
Cijk + Cy = 9 
and by (5) 


re 
» Cink = N 
i, fh, k 


28 SIGURDUR HELGASON 


The space

\[ \mathfrak{u} = \sum_{i=1}^{n} \mathbf{R} e_i \]

is a real form of $\mathfrak{g}$ if and only if all the $c_{ijk}$ are real.
Consider now the set $\mathfrak{F}$ of all bases $(e_1, \dots, e_n)$ of $\mathfrak{g}$ such that (4) holds. Consider the function $f$ on $\mathfrak{F}$ given by

\[ f(e_1, \dots, e_n) = \sum_{i,j,k} |c_{ijk}|^2 \]

Then we have seen that

\[ \sum_{i,j,k} |c_{ijk}|^2 \ge \sum_{i,j,k} c_{ijk}^2 = n \qquad (6) \]

and the equality sign holds if and only if all the $c_{ijk}$ are real, that is, if and only if

\[ \mathfrak{u} = \sum_{i=1}^{n} \mathbf{R} e_i \]

is a real form. In this case it is a compact real form in view of (4) and Prop. 2.1.

Thus Theorem 2.3 follows if one can prove: (I) The function $f$ on $\mathfrak{F}$ has a minimum value; and (II) this minimum value is attained at a point $(e_1^0, \dots, e_n^0) \in \mathfrak{F}$ for which the structural constants are real. Note that (II) is equivalent to (II'): The minimum of $f$ is $n$.
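As a concrete illustration of (4), (5) and (6), here is a minimal sketch for the three-dimensional case, assuming the basis $e_i = L_i/\sqrt{2}$ where $[L_i, L_j] = \sum_k \varepsilon_{ijk} L_k$ are the usual rotation generators. This basis, the scaling and the names `epsilon`, `killing`, `f_value` are choices made here for illustration. The structural constants are real, so $f$ should take the value $n = 3$.

```python
import math

n = 3

def epsilon(i, j, k):
    # Levi-Civita symbol for indices in {0, 1, 2}
    return (i - j) * (j - k) * (k - i) // 2

# Structural constants c_ijk of the rescaled basis e_i = L_i / sqrt(2)
scale = 1 / math.sqrt(2)
c = [[[scale * epsilon(i, j, k) for k in range(n)] for j in range(n)]
     for i in range(n)]

def killing(i, j):
    # B(e_i, e_j) = Tr(ad e_i ad e_j) = sum_{h,k} c_ikh c_jhk, as in (5)
    return sum(c[i][k][h] * c[j][h][k] for h in range(n) for k in range(n))

# Value of f at this basis: sum of |c_ijk|^2 over all indices
f_value = sum(abs(c[i][j][k]) ** 2
              for i in range(n) for j in range(n) for k in range(n))
```

Up to rounding, `killing(i, j)` equals $-\delta_{ij}$, so the basis satisfies (4), and `f_value` equals 3, the lower bound in (6).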


3-3 Cartan Decompositions 


We now go back to considering a semisimple Lie algebra g over R and 
as usual we denote by B the Killing form of g. There are of course many 
possible ways to find a direct vector space decomposition $\mathfrak{g} = \mathfrak{g}^+ + \mathfrak{g}^-$ such
that $B$ is positive definite on $\mathfrak{g}^+$ and negative definite on $\mathfrak{g}^-$. However, we
should like to find a decomposition which is directly related to the Lie 
algebra structure of g. 


Definition. A Cartan decomposition of $\mathfrak{g}$ is a direct decomposition $\mathfrak{g} = \mathfrak{k} + \mathfrak{p}$ such that (i) $B < 0$ on $\mathfrak{k}$, $B > 0$ on $\mathfrak{p}$; and (ii) the mapping $\theta : T + X \to T - X$ ($T \in \mathfrak{k}$, $X \in \mathfrak{p}$) is an automorphism of $\mathfrak{g}$.

In this case $\theta$ is called a Cartan involution of $\mathfrak{g}$ and the positive definite bilinear form $(X, Y) \to -B(X, \theta Y)$ is denoted by $B_\theta$. We shall now establish the existence of Cartan decompositions, using compact real forms for semisimple Lie algebras over $\mathbf{C}$.




Theorem 3.1. Suppose $\theta$ is a Cartan involution of a semisimple Lie algebra $\mathfrak{g}$ over $\mathbf{R}$ and $\sigma$ an arbitrary involutive automorphism of $\mathfrak{g}$. There then exists an automorphism $\varphi$ of $\mathfrak{g}$ such that the Cartan involution $\varphi\theta\varphi^{-1}$ commutes with $\sigma$.


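PROOF. The product $N = \sigma\theta$ is an automorphism of $\mathfrak{g}$ and if $X, Y \in \mathfrak{g}$,

\[ -B_\theta(NX, Y) = B(NX, \theta Y) = B(X, N^{-1}\theta Y) = B(X, \theta N Y) \]

so

\[ B_\theta(NX, Y) = B_\theta(X, NY) \]

that is, $N$ is symmetric with respect to the positive definite bilinear form $B_\theta$. Let $X_1, \dots, X_n$ be a basis of $\mathfrak{g}$ diagonalizing $N$. Then $P = N^2$ has a positive diagonal, say, with elements $\lambda_1, \dots, \lambda_n$. Take $P^t$ ($t \in \mathbf{R}$) with diagonal elements $\lambda_1^t, \dots, \lambda_n^t$ and define the structural constants $c_{ijk}$ by

\[ [X_i, X_j] = \sum_{k} c_{ijk}\, X_k \]

Since $P$ is an automorphism, we conclude

\[ \lambda_i \lambda_j\, c_{ijk} = \lambda_k\, c_{ijk} \]

which implies

\[ \lambda_i^t \lambda_j^t\, c_{ijk} = \lambda_k^t\, c_{ijk} \qquad (t \in \mathbf{R}) \]

so $P^t$ is an automorphism. Put $\theta_t = P^t \theta P^{-t}$. Since $\theta N \theta^{-1} = N^{-1}$, we have $\theta P \theta^{-1} = P^{-1}$, that is $\theta P = P^{-1}\theta$. In matrix terms (using still the basis $X_1, \dots, X_n$) this means (since $\theta$ is symmetric with respect to $B_\theta$)

\[ \theta_{ij}\lambda_j = \lambda_i^{-1}\theta_{ij} \]

so

\[ \theta_{ij}\lambda_j^t = \lambda_i^{-t}\theta_{ij} \]

that is, $\theta P^t \theta^{-1} = P^{-t}$. Hence,

\[ \sigma\theta_t = \sigma P^t \theta P^{-t} = \sigma\theta P^{-2t} = N P^{-2t} \]
\[ \theta_t\sigma = (\sigma\theta_t)^{-1} = P^{2t} N^{-1} = N^{-1} P^{2t} \]

For $t = 1/4$ the two right-hand sides agree, because $N^2 = P$ gives $N P^{-1/2} = N^{-1} P^{1/2}$. Thus $\theta_{1/4}$ commutes with $\sigma$ and one can take $\varphi = P^{1/4}$.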


p. 884). The following result is given in Mostow [54]. 


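Corollary 3.2. Let $\mathfrak{g}$ be a semisimple Lie algebra over $\mathbf{R}$, $\mathfrak{g}_c = \mathfrak{g} + i\mathfrak{g}$ its complexification, $\mathfrak{u}$ any compact real form of $\mathfrak{g}_c$, $\sigma$ and $\tau$ the conjugations of $\mathfrak{g}_c$ with respect to $\mathfrak{g}$ and $\mathfrak{u}$, respectively. Then there exists an automorphism $\varphi$ of $\mathfrak{g}_c$ such that $\varphi \cdot \mathfrak{u}$ is invariant under $\sigma$.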




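PROOF. Let $\mathfrak{g}_c^{\mathbf{R}}$ denote the Lie algebra $\mathfrak{g}_c$ considered as a Lie algebra over $\mathbf{R}$, $B^{\mathbf{R}}$ the Killing form. It is not hard to show that $B^{\mathbf{R}}(X, Y) = 2\operatorname{Re}(B_c(X, Y))$ if $B_c$ is the Killing form of $\mathfrak{g}_c$. Thus $\sigma$ and $\tau$ are Cartan involutions of $\mathfrak{g}_c^{\mathbf{R}}$ and the corollary follows (note that since $\sigma\tau$ is a (complex) automorphism of $\mathfrak{g}_c$, $\varphi$ is one as well).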


Corollary 3.3. Each semisimple Lie algebra g over R has Cartan decomposi- 
tions and any two such are conjugate under an automorphism of g. 


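PROOF. Let $\mathfrak{g}_c$ denote the complexification of $\mathfrak{g}$, $\sigma$ the corresponding conjugation, and $\mathfrak{u}$ a compact real form of $\mathfrak{g}_c$ invariant under $\sigma$ (Theorem 2.3 and Cor. 3.2). Then put $\mathfrak{k} = \mathfrak{g} \cap \mathfrak{u}$, $\mathfrak{p} = \mathfrak{g} \cap i\mathfrak{u}$. Then $B < 0$ on $\mathfrak{k}$, $B > 0$ on $\mathfrak{p}$, and since $\theta : T + X \to T - X$ ($T \in \mathfrak{k}$, $X \in \mathfrak{p}$) is an automorphism, $B(\mathfrak{k}, \mathfrak{p}) = 0$. It follows that $\mathfrak{g} = \mathfrak{k} + \mathfrak{p}$ is a Cartan decomposition.

Consider now two Cartan decompositions,

\[ \mathfrak{g} = \mathfrak{k}_1 + \mathfrak{p}_1, \qquad \mathfrak{g} = \mathfrak{k}_2 + \mathfrak{p}_2 \]

Then $\mathfrak{u}_1 = \mathfrak{k}_1 + i\mathfrak{p}_1$ and $\mathfrak{u}_2 = \mathfrak{k}_2 + i\mathfrak{p}_2$ are compact real forms of $\mathfrak{g}_c$. Let $\tau_1$ and $\tau_2$ denote the corresponding conjugations. By Cor. 3.2 there exists an automorphism $\varphi$ of $\mathfrak{g}_c$ such that $\varphi \cdot \mathfrak{u}_1$ is invariant under $\tau_2$. Thus $\varphi \cdot \mathfrak{u}_1$ is equal to the direct sum of its intersections with $\mathfrak{u}_2$ and $i\mathfrak{u}_2$. Now $B > 0$ on $i\mathfrak{u}_2$ and $B < 0$ on $\varphi \cdot \mathfrak{u}_1$. Hence $i\mathfrak{u}_2 \cap \varphi \cdot \mathfrak{u}_1 = \{0\}$ so $\mathfrak{u}_2 = \varphi \cdot \mathfrak{u}_1$. But $\tau_1$ and $\tau_2$ both leave $\mathfrak{g}$ invariant and $\varphi$ can (according to the proof of Theorem 3.1) be taken as a power of $\tau_1\tau_2$ so it also leaves $\mathfrak{g}$ invariant. Thus $\varphi \cdot (\mathfrak{g} \cap \mathfrak{u}_1) = \mathfrak{g} \cap \mathfrak{u}_2$ so $\varphi$ gives the desired automorphism of $\mathfrak{g}$.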


Examples 


Let g = sl(n, R), the Lie algebra of the group SL(n, R). The group 
SO(n) of orthogonal matrices is a closed subgroup, hence a Lie subgroup, 
and by (8) §2-2, its Lie algebra, denoted so(n), consists of those matrices 
$X \in \mathfrak{sl}(n, \mathbf{R})$ for which $\exp tX \in SO(n)$ for all $t \in \mathbf{R}$. But

\[ \exp tX \in SO(n) \iff \exp tX \, \exp t({}^tX) = 1, \quad \det(\exp tX) = 1 \]

so

\[ \mathfrak{so}(n) = \{ X \in \mathfrak{sl}(n, \mathbf{R}) \mid X + {}^tX = 0 \} \]

the set of skew-symmetric $n \times n$ matrices (which are automatically of trace 0). The mapping $\theta : X \to -{}^tX$ is an automorphism of $\mathfrak{sl}(n, \mathbf{R})$ and $\theta^2 = 1$. Since $B(X, X) = 2n \operatorname{Tr}(XX)$, $B(X, \theta X) < 0$ for $X \ne 0$, so $\theta$ is a Cartan involution and

\[ \mathfrak{sl}(n, \mathbf{R}) = \mathfrak{so}(n) + \mathfrak{p} \qquad (1) \]

where $\mathfrak{p}$ is the set of $n \times n$ symmetric matrices of trace 0, is the corresponding Cartan decomposition. Now it is known that every positive definite matrix can be written uniquely $e^X$ ($X$ = symmetric) and every nonsingular matrix $g$ can be written uniquely $g = op$ ($o$ = orthogonal, $p$ = positive definite). Thus we have a global analog of (1),

\[ SL(n, \mathbf{R}) = SO(n)P \qquad (2) \]

where $P = \exp \mathfrak{p}$, the set of positive definite matrices of determinant 1.
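The factorization $g = op$ behind (2) can be made explicit for $n = 2$. The sketch below assumes the standard closed form $\sqrt{A} = (A + \sqrt{\det A}\, I)/\sqrt{\operatorname{tr} A + 2\sqrt{\det A}}$ for a positive definite $2 \times 2$ matrix $A$, which follows from the Cayley-Hamilton theorem; the function name `polar_sl2` and the sample matrix are illustrative choices, not taken from the text.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def polar_sl2(g):
    # Returns (o, p) with g = o p, o orthogonal, p = sqrt(g^T g) positive definite.
    a = matmul(transpose(g), g)
    s = math.sqrt(a[0][0] * a[1][1] - a[0][1] * a[1][0])  # sqrt(det a)
    t = math.sqrt(a[0][0] + a[1][1] + 2 * s)              # sqrt(tr a + 2 s)
    p = [[(a[0][0] + s) / t, a[0][1] / t],
         [a[1][0] / t, (a[1][1] + s) / t]]
    det_p = p[0][0] * p[1][1] - p[0][1] * p[1][0]
    p_inv = [[p[1][1] / det_p, -p[0][1] / det_p],
             [-p[1][0] / det_p, p[0][0] / det_p]]
    o = matmul(g, p_inv)
    return o, p

g = [[1.0, 1.0], [0.0, 1.0]]  # a sample element of SL(2, R)
o, p = polar_sl2(g)
```

For this sample, `o` is a rotation and `p` is symmetric with determinant 1, matching the two factors $SO(n)$ and $P$ in (2).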
We shall now state a generalization of (2). 


Theorem 3.4. Let G be a connected semisimple Lie group with Lie algebra g. 
Let $\mathfrak{g} = \mathfrak{k} + \mathfrak{p}$ be a Cartan decomposition ($\mathfrak{k}$ the algebra), $K$ the analytic subgroup of $G$ with Lie algebra $\mathfrak{k}$. Then the mapping


(X, k) > (exp X)k 


is a diffeomorphism of p x K onto G. 

In Theorem 3.4, the center $\mathfrak{z}$ of $\mathfrak{g}$ is $\{0\}$ (immediate from the definition)
so the center $Z$ of $G$ is discrete. One can prove $Z \subset K$ and that $K$ is compact
if and only if Z is finite. In this case K is a maximal compact subgroup of G, 
and every compact subgroup is conjugate to a subgroup of K. 


Proposition 3.5. In terms of the notation of Theorem 3.4, the mapping 
(exp X)k > exp(—X)k (3) 


is an automorphism of G. 

In fact let $\tilde{G}$ be the universal covering group of $G$. Since all simply connected Lie groups with the same Lie algebra are isomorphic (cf. (v) §2-2) the automorphism $\theta$ of $\mathfrak{g}$ induces an automorphism $\tilde{\theta}$ of $\tilde{G}$ such that $d\tilde{\theta} = \theta$. By the remarks above, the center $\tilde{Z}$ of $\tilde{G}$ is contained in the analytic subgroup $\tilde{K}$ of $\tilde{G}$ corresponding to $\mathfrak{k}$. But $G = \tilde{G}/N$, where $N \subset \tilde{Z}$ so $\tilde{\theta}$ induces an automorphism of $G$ which is (3).

Consider now the set $G/K$ of left cosets $gK$ ($g \in G$). This set has a unique manifold structure such that the map $X \to (\exp X)K$ is a diffeomorphism of $\mathfrak{p}$ onto $G/K$. (More generally if $K$ is a closed subgroup of a Lie group $G$, $G/K$ is a manifold in a natural way.) The group $G$ operates on $G/K$: each $g \in G$ gives rise to a diffeomorphism $\tau(g) : xK \to gxK$ of $G/K$. Since $Z \subset K$ we have $G/K = (G/Z)/(K/Z)$ and $G/Z = \operatorname{Int}(\mathfrak{g})$ so the space $G/K$ is independent of the choice of the Lie group $G$ with Lie algebra $\mathfrak{g}$. In view of Cor. 3.3 the different possibilities for $K$ are all conjugate so the space $G/K$ is in a canonical way associated with $\mathfrak{g}$. Let $o$ denote the point $\{K\}$ in $G/K$ (the origin) and $(G/K)_o$ the tangent space. The mapping $\pi : g \to gK$ has a differential $d\pi$ mapping $\mathfrak{g}$ onto $(G/K)_o$ with a kernel which contains $\mathfrak{k}$. By reasons of dimensionality, we see therefore that the mapping

\[ d\pi : \mathfrak{p} \to (G/K)_o \qquad (4) \]

is an isomorphism and if $k \in K$ we have for $X \in \mathfrak{p}$, $t \in \mathbf{R}$

\[ \pi(\exp \operatorname{Ad}(k)tX) = \pi(k \exp tX\, k^{-1}) = \tau(k)\pi(\exp tX) \]

so

\[ d\pi(\operatorname{Ad}(k)X) = d\tau(k)\, d\pi(X). \qquad (5) \]

Now the form $B$ is $> 0$ on $\mathfrak{p}$ so by (4) and (5) we obtain a positive definite quadratic form $Q_o$ on $(G/K)_o$ invariant under $d\tau(k)$ ($k \in K$). If $p \in G/K$ is arbitrary there exists a $g \in G$ such that $p = gK$ and $d\tau(g) : (G/K)_o \to (G/K)_p$ is an isomorphism giving rise to a quadratic form $Q_p$ on $(G/K)_p$. If $g' \in G$ satisfies $g'K = gK$, $d\tau(g')$ gives the same quadratic form $Q_p$ on $(G/K)_p$ because of the $K$-invariance of $Q_o$. Thus we have a Riemannian structure $Q$ on $G/K$ induced by $B$.


Proposition 3.6. The manifold G/K with the Riemannian structure induced 
by B is a symmetric space. 


PROOF. Let $\theta$ denote the automorphism (3) and $s_o$ the mapping $gK \to \theta(g)K$ of $G/K$ onto itself. Then $s_o$ is a diffeomorphism and $s_o^2 = I$, $(ds_o)_o = -I$. To see that $s_o$ is an isometry let $p = gK$ ($g \in G$) and $X \in (G/K)_p$. Then the vector $X_o = d\tau(g^{-1})X$ belongs to $(G/K)_o$. But if $x \in G$ we have

\[ s_o(gxK) = \theta(gx)K = \tau(\theta(g))(s_o(xK)) \]

so $s_o \circ \tau(g) = \tau(\theta(g)) \circ s_o$ and therefore

\[ Q(ds_o(X), ds_o(X)) = Q(ds_o \circ d\tau(g)(X_o),\, ds_o \circ d\tau(g)(X_o)) \]
\[ = Q(d\tau(\theta(g)) \circ ds_o(X_o),\, d\tau(\theta(g)) \circ ds_o(X_o)) \]
\[ = Q(X_o, X_o) = Q(X, X) \]

Thus $s_o$ is an isometry and since $(ds_o)_o = -I$, it reverses the geodesics through $o$. The geodesic symmetry with respect to $p = gK$ is given by

\[ s_p = \tau(g) \circ s_o \circ \tau(g^{-1}) \]

which is an isometry, so the proposition follows.


Proposition 3.7. The geodesics through the origin in $G/K$ are the curves $t \to \exp tX \cdot o$ ($X \in \mathfrak{p}$).


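Although the proof is not difficult we shall omit it. Instead let us take a second look at the example $G = SU(1, 1)$. The decomposition

\[ \begin{pmatrix} ix & y \\ \bar{y} & -ix \end{pmatrix} = \begin{pmatrix} ix & 0 \\ 0 & -ix \end{pmatrix} + \begin{pmatrix} 0 & y \\ \bar{y} & 0 \end{pmatrix} \qquad (6) \]

gives a Cartan decomposition of $\mathfrak{su}(1, 1)$. We have also if

\[ X_\beta = \begin{pmatrix} 0 & \beta \\ \bar{\beta} & 0 \end{pmatrix} \]

\[ \exp(tX_\beta) = \cosh(t|\beta|)\, I + \frac{\sinh(t|\beta|)}{|\beta|}\, X_\beta \]

so

\[ \exp(tX_\beta) \cdot o = \tanh(t|\beta|)\, \frac{\beta}{|\beta|} \]

verifying the proposition in this case.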
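The closed form for $\exp(tX_\beta)$ can be compared with a truncated power series. This is a minimal numerical sketch, assuming the sample values $\beta = 0.3 + 0.4i$ and $t = 1.2$ and the usual fractional linear action $z \to (az + b)/(cz + d)$ on the disk; the helper `exp_series` is a hypothetical name introduced here.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def exp_series(m, terms=40):
    # exp(m) = sum_k m^k / k!, truncated after `terms` terms
    result = [[1 + 0j, 0j], [0j, 1 + 0j]]
    term = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in matmul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result

beta = 0.3 + 0.4j  # sample value with |beta| = 0.5
t = 1.2
r = abs(beta)
X = [[0j, beta], [beta.conjugate(), 0j]]
series = exp_series([[t * entry for entry in row] for row in X])

# Closed form: cosh(t|beta|) I + sinh(t|beta|)/|beta| X
closed = [[math.cosh(t * r) + 0j, math.sinh(t * r) / r * beta],
          [math.sinh(t * r) / r * beta.conjugate(), math.cosh(t * r) + 0j]]
max_error = max(abs(series[i][j] - closed[i][j])
                for i in range(2) for j in range(2))

# The matrix [[a, b], [c, d]] sends the origin of the disk to b / d
geodesic_point = series[0][1] / series[1][1]
expected_point = math.tanh(t * r) * beta / r
```

Both `max_error` and the distance between `geodesic_point` and `expected_point` are at the level of rounding error, so the curve $t \to \exp(tX_\beta) \cdot o$ runs along the diameter through $\beta/|\beta|$.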


3-4 Discussion of Symmetric Spaces 


We shall now summarize some basic results in the general theory of 
symmetric spaces and indicate how the coset spaces G/K from the last section 
fit into this general theory. 

Let $M$ be a symmetric space as defined in Ch. 1. The group $I(M)$ of all isometries of $M$ is transitive on $M$. (In fact, if $p, q \in M$ they can be joined by a broken geodesic and the product of the symmetries in the midpoints of these geodesics gives the desired isometry.) One can now parametrize the group $I(M)$ in a natural way turning it into a Lie group. The identity component $G = I_0(M)$ is still transitive on $M$. Fix a point $o \in M$ and let $K$ be the group of elements in $G$ which leave $o$ fixed. Then the mapping $gK \to g \cdot o$ is a diffeomorphism of $G/K$ onto $M$. If $s_o$ is the geodesic symmetry with respect to $o$ the mapping $\sigma : g \to s_o g s_o$ is an involutive automorphism of $G$ and $(K_\sigma)_0 \subset K \subset K_\sigma$, where $K_\sigma$ is the set of fixed points of $\sigma$ and $(K_\sigma)_0$ its identity component. In order to verify these inclusions let $k \in K$. Then the maps $k$ and $s_o k s_o$ are isometries leaving $o$ fixed and inducing the same linear map of the tangent space $M_o$. Considering the geodesics starting at $o$ we see that $k$ and $s_o k s_o$ must coincide so $K \subset K_\sigma$. On the other hand, suppose $X$ in the Lie algebra $\mathfrak{g}$ of $G$ is fixed under the differential $(d\sigma)_e$. Then $s_o \exp tX\, s_o = \exp tX$ for all $t \in \mathbf{R}$, so applying both sides to the point $o$ we see that $\exp tX \cdot o$ is fixed under $s_o$. But $o$ is an isolated fixed point of $s_o$ so $\exp tX \cdot o = o$ for all sufficiently small $t$. But then $X \in \mathfrak{k}$, the Lie algebra of $K$, whence $(K_\sigma)_0 \subset K$. Note finally that the group $\operatorname{Ad}_G(K)$ is compact, being a continuous image of the compact group $K$.

Conversely, let G be a connected Lie group, K a closed subgroup, 
$\operatorname{Ad}_G(K)$ compact. Suppose there exists an involutive automorphism $\sigma$ of $G$ such that $(K_\sigma)_0 \subset K \subset K_\sigma$. Then there exists a Riemannian structure on $G/K$ invariant under $G$, and for every such Riemannian structure, $G/K$ is a symmetric space.




Consider now $M$ as above and $G = I_0(M)$; $M$ is said to be of the noncompact type if $G$ is noncompact, semisimple without a compact normal subgroup $\ne \{e\}$, and of the compact type if $G$ is compact and semisimple.


Proposition 4.1. Let M be a symmetric space, which is simply connected. 
Then M is a product 


\[ M = M_0 \times M_- \times M_+ \]


where $M_0$ is a Euclidean space and $M_-$ and $M_+$ are symmetric spaces of the
compact type and the noncompact type, respectively.


Proposition 4.2. A symmetric space of the compact type (noncompact type) 
has sectional curvature everywhere $\ge 0$ (respectively $\le 0$).

There is a very interesting duality between the compact type and the 
noncompact type. Let $M = G/K$ be a symmetric space of the noncompact type where $G = I_0(M)$. Let $\mathfrak{g}$ and $\mathfrak{k}$ denote the Lie algebras of $G$ and $K$, respectively. Let $\mathfrak{g} = \mathfrak{k} + \mathfrak{p}$ be the corresponding Cartan decomposition of $\mathfrak{g}$ and $\mathfrak{g}_c = \mathfrak{g} + i\mathfrak{g}$ the complexification of $\mathfrak{g}$. Since $[\mathfrak{p}, \mathfrak{p}] \subset \mathfrak{k}$, the subspace $\mathfrak{u} = \mathfrak{k} + i\mathfrak{p}$ of $\mathfrak{g}_c$ is actually a Lie algebra and another real form of $\mathfrak{g}_c$. Since the Killing form of $\mathfrak{g}_c$ is $< 0$ on $\mathfrak{k}$, and $> 0$ on $\mathfrak{p}$, it is $< 0$ on $\mathfrak{u}$, so $\mathfrak{u}$ is a compact real form. If $U$ is a connected Lie group with Lie algebra $\mathfrak{u}$ and $K'$ is the connected Lie subgroup with Lie algebra $\mathfrak{k}$, the space $U/K'$ is a symmetric space of the compact type. This process can be reversed, that is, $G/K$ can be constructed with $U/K'$ as a starting point.


Examples 
(i) Consider the symmetric space $G/K$, where $G = SU(1, 1)$ and $K$ the subgroup of matrices $\begin{pmatrix} a & 0 \\ 0 & \bar{a} \end{pmatrix}$, $|a| = 1$. In this case the Cartan decomposition (6) in §3-3 shows that $\mathfrak{u}$ is the set of all matrices of the form

\[ \begin{pmatrix} ix & 0 \\ 0 & -ix \end{pmatrix} + \begin{pmatrix} 0 & iy \\ i\bar{y} & 0 \end{pmatrix} \]

so $\mathfrak{u} = \mathfrak{su}(2)$, the algebra of all $2 \times 2$ skew-Hermitian matrices of trace 0. For the space $U/K'$ we can therefore take the space $SU(2)/K$. [$SU(n)$ denotes the special unitary group.] It is not hard to show that when the unit sphere $S^2$ is projected stereographically onto the complex plane the rotations of the sphere correspond to the transformations

\[ z \to \frac{az + b}{-\bar{b}z + \bar{a}}, \qquad |a|^2 + |b|^2 = 1 \]

that is, to the members of $SU(2)$. In this manner $SU(2)$ acts transitively on $S^2$ and the subgroup leaving the point $z = 0$ fixed is $K$. Thus $U/K = S^2$ so the non-Euclidean disk $D$ (Ch. 1) and the sphere $S^2$ correspond under the general duality indicated. The formulas $\mathfrak{g} = \mathfrak{k} + \mathfrak{p}$ and $\mathfrak{u} = \mathfrak{k} + i\mathfrak{p}$ can be regarded as an explanation of the phenomenon that the triangle formulas in non-Euclidean trigonometry are obtained from the triangle formulas in spherical trigonometry by replacing the sides $a, b, c$ by $ia, ib, ic$ and using the relations $\sinh(ia) = i \sin a$, $\cosh(ia) = \cos a$. Lobatschevsky did indeed speak of his non-Euclidean trigonometry as spherical trigonometry on a sphere of imaginary radius.
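The substitution $a \to ia$ can be tried numerically on one triangle formula. The sketch below assumes the spherical law of cosines $\cos a = \cos b \cos c + \sin b \sin c \cos A$ and its hyperbolic counterpart $\cosh a = \cosh b \cosh c - \sinh b \sinh c \cos A$ (curvature $-1$); the sample side lengths and the function name are illustrative choices made here.

```python
import cmath
import math

def spherical_cosine_residual(a, b, c, angle_A):
    # cos a - (cos b cos c + sin b sin c cos A); complex sides are allowed
    return cmath.cos(a) - (cmath.cos(b) * cmath.cos(c)
                           + cmath.sin(b) * cmath.sin(c) * math.cos(angle_A))

# Two sides and the enclosed angle of a non-Euclidean triangle (sample values)
b, c, angle_A = 0.7, 1.1, 0.9

# Third side from the hyperbolic law of cosines
a = math.acosh(math.cosh(b) * math.cosh(c)
               - math.sinh(b) * math.sinh(c) * math.cos(angle_A))

# The spherical formula evaluated at the imaginary sides ia, ib, ic
residual = spherical_cosine_residual(1j * a, 1j * b, 1j * c, angle_A)
```

The residual vanishes up to rounding, so the spherical formula with sides $ia, ib, ic$ reproduces the hyperbolic one for this triangle.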

(ii) Let $U$ be a connected, compact Lie group with Lie algebra $\mathfrak{u}$. If $Q$ is any positive definite quadratic form on $\mathfrak{u}$, we obtain by left translations such quadratic forms on each tangent space to $U$ and therefore a Riemannian metric on $U$ which is invariant under all left translations. If $Q$ is chosen invariant under $\operatorname{Ad}(U)$ then the Riemannian metric is invariant under right translations as well. One can prove that the geodesics through $e$ are the one-parameter subgroups and the symmetry $s_e : x \to x^{-1}$ is an isometry so $U$ is a symmetric space. If $U^*$ denotes the diagonal in $U \times U$ one has a diffeomorphism $(u_1, u_2)U^* \to u_1 u_2^{-1}$ of $(U \times U)/U^*$ onto $U$. The group involution $(u_1, u_2) \to (u_2, u_1)$ of $U \times U$ leaves $U^*$ pointwise fixed and induces the symmetry $s_e$ of $U$, via the diffeomorphism indicated.

If $U$ is in addition semisimple, the symmetric space $(U \times U)/U^*$ has in the above sense a noncompact dual $G/U'$, where $U'$ has Lie algebra $\mathfrak{u}$ and the Lie algebra $\mathfrak{g}$ of $G$ is a certain real form of the complexification of the product algebra $\mathfrak{u} \times \mathfrak{u}$. One can prove that as $\mathfrak{u}$ runs through the compact semisimple Lie algebras, $\mathfrak{g}$ runs through the complex semisimple Lie algebras (regarded as Lie algebras over $\mathbf{R}$).


3-5 The Iwasawa Decomposition


Let $\mathfrak{g}$ be a semisimple Lie algebra, $\mathfrak{g} = \mathfrak{k} + \mathfrak{p}$ a Cartan decomposition. The operators $\operatorname{ad} X$ ($X \in \mathfrak{p}$) are all symmetric with respect to the positive definite form $B_\theta$ and each of them can therefore be diagonalized, and a commutative family can be simultaneously diagonalized. Hence let $\mathfrak{a}$ denote a maximal Abelian subspace of $\mathfrak{p}$ and if $\alpha$ is a real-valued linear function on $\mathfrak{a}$ put

\[ \mathfrak{g}_\alpha = \{ X \in \mathfrak{g} \mid [H, X] = \alpha(H)X \text{ for all } H \in \mathfrak{a} \} \qquad (1) \]

If $\mathfrak{g}_\alpha \ne \{0\}$, $\alpha \ne 0$, $\alpha$ is called a restricted root. Clearly, if $\Sigma$ denotes the set of restricted roots,

\[ \mathfrak{g} = \sum_{\alpha \in \Sigma} \mathfrak{g}_\alpha + \mathfrak{g}_0 \qquad (2) \]

The dimension $\dim(\mathfrak{g}_\alpha)$ is called the multiplicity of $\alpha$. Let $\mathfrak{a}'$ denote the set of elements in $\mathfrak{a}$ where all roots are $\ne 0$. The connected components of $\mathfrak{a}'$ are intersections of half spaces; hence they are convex open sets. They are called Weyl chambers. Fix any Weyl chamber $\mathfrak{a}^+$ and call a restricted root positive if its values on $\mathfrak{a}^+$ are positive.

Let $\Sigma^+$ denote the set of positive restricted roots and put

\[ \mathfrak{n} = \sum_{\alpha > 0} \mathfrak{g}_\alpha, \qquad \rho = \frac{1}{2} \sum_{\alpha > 0} (\dim \mathfrak{g}_\alpha)\,\alpha \qquad (3) \]

Then $\mathfrak{n}$ is a nilpotent Lie algebra. The following result is called the Iwasawa decomposition.


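Theorem 5.1. $\mathfrak{g} = \mathfrak{k} + \mathfrak{a} + \mathfrak{n}$ (direct vector space sum). Let $G$ be any connected Lie group with Lie algebra $\mathfrak{g}$, and let $K$, $A$, $N$ denote the analytic subgroups corresponding to $\mathfrak{k}$, $\mathfrak{a}$, and $\mathfrak{n}$, respectively. Then the mapping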


(k,a,n)—> kan 


is a diffeomorphism of K x A x N onto G. 
Rather than give the proof we consider some examples. Consider the 
Cartan decomposition (1) §3-3, 


\[ \mathfrak{sl}(n, \mathbf{R}) = \mathfrak{so}(n) + \mathfrak{p} \qquad (4) \]

The diagonal matrices of trace 0 form a maximal Abelian subspace $\mathfrak{a}$ of $\mathfrak{p}$ and as in §3-2 we find that the corresponding restricted roots are the linear forms $\alpha_{ij}(H) = e_i(H) - e_j(H)$ ($H \in \mathfrak{a}$), $e_i(H)$ being the $i$th diagonal element in $H$. Hence $\mathfrak{a}'$ consists of those $H$ for which all $e_i(H)$ are different. The set

\[ \{ H \in \mathfrak{a} \mid e_1(H) > e_2(H) > \cdots > e_n(H) \} \qquad (5) \]

is clearly a connected component of $\mathfrak{a}'$ and we take this as the Weyl chamber $\mathfrak{a}^+$. Then $\Sigma^+$ consists of the roots $\alpha_{ij}$ ($i < j$) and $\mathfrak{n}$ is easily found to be the set of upper triangular matrices with 0 in the diagonal. An Iwasawa decomposition of the group $SL(n, \mathbf{R})$ is therefore $g = oan$, where $o \in SO(n)$, $a$ is a diagonal matrix of determinant 1 and diagonal $> 0$, and $n$ is an upper triangular matrix with all diagonal elements 1.
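For $SL(n, \mathbf{R})$ the factors $o$, $a$, $n$ can be computed by Gram-Schmidt orthogonalization of the columns of $g$. The text does not prescribe this route, so the following is a sketch under that assumption; the function name `iwasawa_sl` and the sample matrix are hypothetical.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def iwasawa_sl(g):
    # Classical Gram-Schmidt on the columns of g gives g = Q R with Q orthogonal
    # and R upper triangular with positive diagonal. Then a = diag(R) and
    # nil = a^{-1} R is upper triangular with ones on the diagonal.
    n = len(g)
    cols = [[g[row][col] for row in range(n)] for col in range(n)]
    q_cols = []
    r = [[0.0] * n for _ in range(n)]
    for j, v in enumerate(cols):
        w = list(v)
        for i, q in enumerate(q_cols):
            r[i][j] = sum(q[t] * v[t] for t in range(n))
            w = [w[t] - r[i][j] * q[t] for t in range(n)]
        r[j][j] = math.sqrt(sum(x * x for x in w))
        q_cols.append([x / r[j][j] for x in w])
    k = [[q_cols[col][row] for col in range(n)] for row in range(n)]
    a = [[r[i][i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    nil = [[r[i][j] / r[i][i] for j in range(n)] for i in range(n)]
    return k, a, nil

g = [[2.0, 1.0], [1.0, 1.0]]  # sample element of SL(2, R)
k, a, nil = iwasawa_sl(g)
product = matmul(matmul(k, a), nil)
```

Since $\det g = 1$ and the diagonal of $R$ is positive, the orthogonal factor has determinant 1 and the diagonal factor has determinant 1 as well, as the statement above requires.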


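For another example consider the Cartan decomposition of $\mathfrak{su}(1, 1)$ given by

\[ \begin{pmatrix} ix & y \\ \bar{y} & -ix \end{pmatrix} = \begin{pmatrix} ix & 0 \\ 0 & -ix \end{pmatrix} + \begin{pmatrix} 0 & y \\ \bar{y} & 0 \end{pmatrix} \]

where $x \in \mathbf{R}$, $y \in \mathbf{C}$. As the space $\mathfrak{a}$ we can take

\[ \mathfrak{a} = \mathbf{R} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \]

and since

\[ \left[ \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \begin{pmatrix} i & -i \\ i & -i \end{pmatrix} \right] = 2 \begin{pmatrix} i & -i \\ i & -i \end{pmatrix}, \qquad \left[ \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \begin{pmatrix} i & i \\ -i & -i \end{pmatrix} \right] = -2 \begin{pmatrix} i & i \\ -i & -i \end{pmatrix} \]

we see that the decomposition (2) equals

\[ \mathfrak{su}(1, 1) = \mathbf{R} \begin{pmatrix} i & -i \\ i & -i \end{pmatrix} + \mathbf{R} \begin{pmatrix} i & i \\ -i & -i \end{pmatrix} + \mathbf{R} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \]

and the restricted roots are $\alpha$ and $-\alpha$, where

\[ \alpha \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = 2 \]

Thus $\mathfrak{a}'$ consists of the nonzero elements in $\mathfrak{a}$ and for $\mathfrak{a}^+$ we take for example

\[ \mathfrak{a}^+ = \left\{ t \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} : t > 0 \right\} \]

so

\[ \mathfrak{n} = \mathbf{R} \begin{pmatrix} i & -i \\ i & -i \end{pmatrix} \]

and $N = \exp \mathfrak{n}$ equals the group of matrices

\[ \begin{pmatrix} 1 + in & -in \\ in & 1 - in \end{pmatrix} \in SU(1, 1) \qquad (n \in \mathbf{R}) \]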

The Iwasawa decomposition of a semisimple Lie algebra g involves 
some free choices, namely, that of $\mathfrak{k}$, $\mathfrak{a}$, and $\mathfrak{a}^+$. We have seen that $\mathfrak{k}$ is unique
up to conjugacy, and now we shall see that $\mathfrak{a}$ and $\mathfrak{a}^+$ are uniquely determined


up to conjugacy by elements of K. We begin with a result which goes back 
to Weyl and Cartan with a proof given by Hunt [41]. 


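Theorem 5.2. Let $\mathfrak{a}$ and $\mathfrak{a}'$ be two maximal Abelian subspaces of $\mathfrak{p}$. Then there exists an element $k \in K$ such that $\operatorname{Ad}_G(k)\,\mathfrak{a} = \mathfrak{a}'$. Also

\[ \mathfrak{p} = \bigcup_{k \in K} \operatorname{Ad}_G(k)\,\mathfrak{a} \]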


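PROOF. Select $H \in \mathfrak{a}$ such that its centralizer in $\mathfrak{p}$ equals $\mathfrak{a}$. (It suffices to take $H$ such that $\alpha(H) \ne 0$ for all restricted roots $\alpha$.) Put $K^* = \operatorname{Ad}_G(K)$ and let $X \in \mathfrak{p}$ be arbitrary. The function

\[ k^* \to B(H, k^* \cdot X) \qquad (k^* \in K^*) \]

has a minimum, say, for $k^* = k_0$. If $T \in \mathfrak{k}$ we have therefore

\[ \left\{ \frac{d}{dt} B(H, \operatorname{Ad}(\exp tT) k_0 \cdot X) \right\}_{t=0} = 0 \]

so

\[ B(H, [T, k_0 \cdot X]) = 0 \qquad T \in \mathfrak{k} \]

Thus

\[ B(T, [H, k_0 \cdot X]) = 0 \quad \text{for all } T \in \mathfrak{k} \]

and since $[H, k_0 \cdot X] \in \mathfrak{k}$ we deduce $[H, k_0 \cdot X] = 0$ so by the choice of $H$, $k_0 \cdot X \in \mathfrak{a}$.

In particular, there exists a $k_1 \in K$ such that $H \in \operatorname{Ad}(k_1)\mathfrak{a}'$. Thus each element in $\operatorname{Ad}(k_1)\mathfrak{a}'$ commutes with $H$ so $\operatorname{Ad}(k_1)\mathfrak{a}' \subset \mathfrak{a}$. This proves the theorem.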


3-6 The Weyl Group 


Let $\mathfrak{g}$ be a semisimple Lie algebra, $\mathfrak{g} = \mathfrak{k} + \mathfrak{p}$ a Cartan decomposition, $G$ any connected Lie group with Lie algebra $\mathfrak{g}$, $K$ the analytic subgroup with Lie algebra $\mathfrak{k} \subset \mathfrak{g}$. Consider as before a maximal Abelian subspace $\mathfrak{a} \subset \mathfrak{p}$ and let $M'$ and $M$ denote, respectively, the normalizer and centralizer of $\mathfrak{a}$ in $K$; that is,

\[ M' = \{ k \in K \mid \operatorname{Ad}(k)\mathfrak{a} \subset \mathfrak{a} \} \]
\[ M = \{ k \in K \mid \operatorname{Ad}(k)H = H \text{ for all } H \in \mathfrak{a} \} \]

Clearly $M$ is a normal subgroup of $M'$ and the factor group $M'/M$ can obviously be viewed as a group of linear transformations of $\mathfrak{a}$. It is called the Weyl group and denoted $W$. In view of Theorem 5.2 it is (up to isomorphism) independent of the choice of $\mathfrak{a}$.

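Now $M$ and $M'$ are Lie subgroups of $K$ and their Lie algebras $\mathfrak{m}$ and $\mathfrak{m}'$ are given by (cf. (8) §2-2, (7) §3-1),

\[ \mathfrak{m} = \{ T \in \mathfrak{k} \mid [H, T] = 0 \text{ for all } H \in \mathfrak{a} \} \]
\[ \mathfrak{m}' = \{ T \in \mathfrak{k} \mid [H, T] \in \mathfrak{a} \text{ for all } H \in \mathfrak{a} \} \]

Note, however, that if $T \in \mathfrak{m}'$ then for $H \in \mathfrak{a}$,

\[ B([H, T], [H, T]) = -B([H, [H, T]], T) = 0 \]

so $T \in \mathfrak{m}$, whence $\mathfrak{m} = \mathfrak{m}'$. Thus $M'/M$ is a discrete group and being also compact, must be finite.

If $\lambda$ is a complex-valued linear function on $\mathfrak{a}$ let $H_\lambda$ denote the vector in $\mathfrak{a} + i\mathfrak{a}$ determined by $B(H, H_\lambda) = \lambda(H)$ for all $H \in \mathfrak{a}$. For $\alpha \in \Sigma$ we let $s_\alpha$ denote the symmetry in the hyperplane $\alpha(H) = 0$:

\[ s_\alpha(H) = H - 2\,\frac{\alpha(H)}{\alpha(H_\alpha)}\,H_\alpha, \qquad H \in \mathfrak{a}. \qquad (1) \]

(Remember $\mathfrak{p}$ and hence $\mathfrak{a}$ have a Euclidean metric given by $B$.)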
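Formula (1) can be evaluated by hand in the case $\mathfrak{g} = \mathfrak{sl}(3, \mathbf{R})$ with $\mathfrak{a}$ the diagonal matrices of trace 0. The sketch below assumes the Killing form $B(X, Y) = 2n \operatorname{Tr}(XY)$ from §3-3 and represents a diagonal matrix by its list of diagonal entries; the function names and the sample $H$ are illustrative. The reflection for the root $\alpha_{12}$ should interchange the first two diagonal entries.

```python
from fractions import Fraction

n = 3

def killing_diag(h1, h2):
    # Killing form of sl(n, R) on diagonal matrices: B(X, Y) = 2n Tr(XY)
    return 2 * n * sum(x * y for x, y in zip(h1, h2))

def alpha(i, j, h):
    # Restricted root alpha_ij(H) = e_i(H) - e_j(H)
    return h[i] - h[j]

def root_vector(i, j):
    # H_alpha with B(H, H_alpha) = alpha_ij(H) for all trace-0 diagonal H
    v = [Fraction(0)] * n
    v[i] = Fraction(1, 2 * n)
    v[j] = Fraction(-1, 2 * n)
    return v

def reflect(i, j, h):
    # s_alpha(H) = H - 2 alpha(H) / alpha(H_alpha) * H_alpha
    h_alpha = root_vector(i, j)
    coeff = 2 * alpha(i, j, h) / alpha(i, j, h_alpha)
    return [x - coeff * y for x, y in zip(h, h_alpha)]

H = [Fraction(5), Fraction(-2), Fraction(-3)]  # sample element of a
reflected = reflect(0, 1, H)
```

With exact rational arithmetic, `reflected` is the diagonal $(-2, 5, -3)$: the symmetry $s_{\alpha_{12}}$ acts as the transposition of the first two entries, consistent with the Weyl group acting on $\mathfrak{a}$ by permutations of the diagonal in this example.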


Theorem 6.1. $s_\alpha \in W$ for each $\alpha \in \Sigma$.




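PROOF. Pick $Z_\alpha \in \mathfrak{g}$ such that $[H, Z_\alpha] = \alpha(H) Z_\alpha$. Decomposing $Z_\alpha = T_\alpha + X_\alpha$ ($T_\alpha \in \mathfrak{k}$, $X_\alpha \in \mathfrak{p}$) the relations $[\mathfrak{k}, \mathfrak{p}] \subset \mathfrak{p}$, $[\mathfrak{p}, \mathfrak{p}] \subset \mathfrak{k}$ imply that $(\operatorname{ad} H)^2 T_\alpha = \alpha(H)^2 T_\alpha$. Multiplying $Z_\alpha$ by a real factor if necessary we may assume $B(T_\alpha, T_\alpha) = -1$. Now if $\alpha(H) = 0$ we have $[H, T_\alpha] = 0$ so

\[ \operatorname{Ad}(\exp tT_\alpha)H = e^{\operatorname{ad} tT_\alpha}(H) = H \quad \text{if } \alpha(H) = 0 \]

A simple computation shows that

\[ e^{\operatorname{ad}(t_0 T_\alpha)} H_\alpha = -H_\alpha \]

provided $t_0(\alpha(H_\alpha))^{1/2} = \pi$. Thus $s_\alpha$ coincides with the restriction of $\operatorname{Ad}(\exp t_0 T_\alpha)$ to $\mathfrak{a}$.

If $s \in W$ and $\alpha \in \Sigma$ it is clear from the definitions that the linear function $\alpha^s : H \to \alpha(s^{-1}H)$ on $\mathfrak{a}$ is a restricted root. Consequently, $s$ permutes the Weyl chambers. Now let $C_1$ and $C_2$ be two Weyl chambers and let $H_1 \in C_1$, $H_2 \in C_2$. If the segment $H_1H_2$ intersects a hyperplane $\alpha(H) = 0$ ($\alpha \in \Sigma$) then clearly the norm $|\ |$ in $\mathfrak{a}$ satisfies

\[ |H_1 - H_2| > |H_1 - s_\alpha H_2| \qquad (2) \]

As $s$ runs through the finite group $W$ the function $|H_1 - sH_2|$ takes a minimum, say for $s = s_0$. By (2) the segment from $H_1$ to $s_0 H_2$ intersects no hyperplane $\alpha(H) = 0$ ($\alpha \in \Sigma$) so $H_1$ and $s_0 H_2$ lie in the same Weyl chamber and thus $C_1 = s_0 C_2$. This proves: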


Corollary 6.2. Any two Weyl chambers in $\mathfrak{a}$ are conjugate under some element of $\operatorname{Ad}_G(K)$ which leaves $\mathfrak{a}$ invariant.

For orientation we state without proof a somewhat deeper result on 
the Weyl group. 


Theorem 6.3. The Weyl group $W$ is generated by the symmetries $s_\alpha$ ($\alpha \in \Sigma$) and it is simply transitive on the set of Weyl chambers in $\mathfrak{a}$.


3-7 Boundary and Polar Coordinates on the Symmetric Space G/K


For the non-Euclidean disk $D$ we have a natural notion of boundary, namely, the unit circle $|z| = 1$. However, this boundary notion refers to the position of $D$ in $\mathbf{R}^2$. In order to make this definition more intrinsic we can define the boundary of $D$ as the set of all rays (half-lines) from the origin in $D$. This motivates the following definition of the boundary of the symmetric space $G/K$. First, we recall the isomorphism $d\pi : \mathfrak{p} \to (G/K)_o$ from §3-3, which permits us to think of $\mathfrak{p}$ as the tangent space to $G/K$ at $o$. Then we understand by a Weyl chamber in $\mathfrak{p}$ a Weyl chamber in some maximal Abelian subspace of $\mathfrak{p}$. The boundary of $G/K$ is now defined as the set of all Weyl chambers in $\mathfrak{p}$. Now fix $\mathfrak{a} \subset \mathfrak{p}$ and $\mathfrak{a}^+$ a Weyl chamber in $\mathfrak{a}$. Then according to Theorem 5.2 and Cor. 6.2, $\operatorname{Ad}(k)\mathfrak{a}^+$ ($k \in K$) runs through the boundary and if $\operatorname{Ad}(k)\mathfrak{a}^+ = \mathfrak{a}^+$, then $k \in M'$ so $\operatorname{Ad}(k)$ on $\mathfrak{a}$ is a member of the Weyl group. Using Theorem 6.3 we see that $k \in M$. Thus the mapping

\[ kM \to \operatorname{Ad}(k)\mathfrak{a}^+ \]


identifies $K/M$ with the boundary of $G/K$. In view of the Iwasawa decomposition $G = KAN$ and the fact that $M$ normalizes $AN$ we have a diffeomorphism

\[ kM \to kMAN \]

of $K/M$ onto $G/MAN$. In his paper [19], Furstenberg defines a boundary of $G$ to be a compact coset space $G/H$ of $G$ such that for each probability measure $\mu$ on $G/H$ there exists a sequence $(g_n) \subset G$ such that the transformed measures $g_n \cdot \mu$ converge weakly to the delta function on $G/H$. It was proved by Furstenberg [19] and Moore [53] that a "maximal" boundary of this sort is given by $G/MAN$ which, as we saw, coincides with the geometrically defined boundary above. The relation $K/M = G/MAN$ shows in particular that $G$ acts as a transformation group on the boundary; in an explicit manner

\[ g(kM) = k(gk)M \]

if for $x \in G$, $k(x) \in K$ is given by $x \in k(x)AN$.

Now let $A^+ = \exp \mathfrak{a}^+$. Then we have the following "polar coordinate representation" of the symmetric space $G/K$.


Theorem 7.1. The mapping $(kM, a) \to kaK$ is a diffeomorphism of $K/M \times A^+$ onto an open submanifold of $G/K$ whose complement in $G/K$ has lower dimension.

Without spelling out the proof in detail we remark that it is a fairly 
direct consequence of Theorems 3.4, 5.2, and 6.3. 


CHAPTER 4: FUNCTIONS ON SYMMETRIC SPACES 


4-1 Invariant Differential Operators


Let $M$ be a manifold and $D$ a differential operator on $M$, that is, a linear mapping of $C_c^\infty(M)$ into itself which in an arbitrary coordinate system is expressed by partial derivatives in the coordinates. Let $\varphi : M \to M$ be a diffeomorphism, and if $f$ is a function on $M$ put $f^\varphi = f \circ \varphi^{-1}$ and let $D^\varphi$ denote the operator

\[ D^\varphi f = (D f^{\varphi^{-1}})^\varphi \]

Then $D^\varphi$ is another differential operator, and we say $D$ is invariant under $\varphi$ if $D^\varphi = D$.


Examples 


Let us find all differential operators D on R" which are invariant under 
all rigid motions. Since D is invariant under all translations it has constant 
coefficients so $D = P(\partial/\partial x_1, \dots, \partial/\partial x_n)$, where $P$ is a polynomial. But $D$ is also invariant under all rotations around 0 so $P$ is rotation-invariant, and since the rotations are transitive on each sphere $|x| = r$, we find $P$ is constant on each such sphere so $P(x_1, \dots, x_n)$ is a function of $x_1^2 + \cdots + x_n^2$, hence a polynomial in $x_1^2 + \cdots + x_n^2$.


Proposition 1.1. The differential operators on $\mathbf{R}^n$ which are invariant under all isometries are the operators $\sum_k a_k \Delta^k$ ($a_k \in \mathbf{C}$), where $\Delta$ is the Laplacian.

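This result holds also if $\mathbf{R}^n$ is replaced by a symmetric space of rank 1 (and $\Delta$ by the Laplace-Beltrami operator) and also if we replace the isometries of $\mathbf{R}^n$ by the inhomogeneous Lorentz group, in which case the Laplacian is replaced (cf. [29], p. 271) by the operator

\[ \frac{\partial^2}{\partial x_1^2} - \frac{\partial^2}{\partial x_2^2} - \cdots - \frac{\partial^2}{\partial x_n^2} \]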


Now if $M$ is a Riemannian manifold the Laplace-Beltrami operator $\Delta$ on $M$ is invariant under all isometries of $M$. The examples above have a high degree of mobility, that is, a large group of isometries, so essentially only $\Delta$ is invariant. The following interesting generalization is essentially
a combination of results of Harish-Chandra and Chevalley (see [31] p. 432). 
It expresses in a precise way how higher rank of the space, that is, lower 
degree of mobility, leads to more invariant operators. 


Theorem 1.2. Let $G/K$ be a symmetric space of rank $l$. Then the algebra of all $G$-invariant differential operators on $G/K$ is a commutative algebra with $l$ algebraically independent generators.

It will now be convenient to assume that G has finite center so K is 
compact. As pointed out in §3-3, this is no restriction on the symmetric 
space G/K. Let L(g) and R(g) denote left and right translations on G by 
the group element g and let D(G) denote the set of all differential operators 
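on $G$ invariant under all $L(g)$. If $X \in \mathfrak{g}$ the operator

\[ \tilde{X} : F(g) \to \left\{ \frac{d}{dt} F(g \exp tX) \right\}_{t=0} \]

belongs to $D(G)$. Let $D_K(G)$ denote the set of elements in $D(G)$ which are invariant under all $R(k)$ ($k \in K$). For $D \in D(G)$ we put

\[ D^\natural = \int_K D^{R(k)}\, dk \qquad (1) \]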




where $dk$ denotes the normalized Haar measure on $K$. The integral makes sense since all the operators $D^{R(k)}$ ($k \in K$) belong to a fixed finite-dimensional vector space, so $D^\natural$ is a differential operator on $G$. Clearly $D^\natural \in D_K(G)$, and we have

\[ (D^\natural F)(e) = (DF)(e) \qquad (2) \]

for every $F \in C^\infty(G)$ which is bi-invariant under $K$ (that is, $F(k_1 g k_2) = F(g)$, $g \in G$, $k_1, k_2 \in K$). In fact,

\[ (D^\natural F)(e) = \int_K (D^{R(k)}F)(e)\, dk = \int_K (D F^{R(k^{-1})})^{R(k)}(e)\, dk \]
\[ = \int_K (DF)(k^{-1})\, dk = \int_K (D F^{L(k)})(e)\, dk = \int_K (DF)(e)\, dk = (DF)(e) \]

Let $\pi$ denote the natural projection $g \to gK$ of $G$ onto $G/K$; if $f$ is a function on $G/K$ we put $\tilde{f} = f \circ \pi$. Then the mapping $f \to \tilde{f}$ is an isomorphism of $C^\infty(G/K)$ onto the space $C_K^\infty(G)$ of functions $F \in C^\infty(G)$ satisfying $F(gk) = F(g)$. Similarly, we would like to "lift" the operators in $D(G/K)$ to the group $G$. If $D \in D_K(G)$ let $\pi(D)$ denote the operator on $C^\infty(G/K)$ determined by $(\pi(D)f)^\sim = D\tilde{f}$ ($f \in C^\infty(G/K)$). It is easy to see (cf. [31], p. 390) that the map $D \to \pi(D)$ maps $D_K(G)$ onto $D(G/K)$.

As before let $\tau(g)$ denote the diffeomorphism $hK \to ghK$ of $G/K$ onto itself. We shall often denote the symmetric space $G/K$ by $X$.


4-2 Harmonic Functions on Symmetric Spaces 


In view of Prop. 1.1 it is natural to make the following definition. 


Definition. A function $u \in C^\infty(G/K)$ is called harmonic if $Du = 0$ for all $D \in D(G/K)$ which annihilate the constants (that is, "without constant term").

Godement made this definition in [22] (even for nonsymmetric spaces 
G/K), where he proved also the mean value theorem below. 


Theorem 2.1. A function $u \in C^\infty(G/K)$ is harmonic if and only if

\[ \int_K u(gkh \cdot o)\, dk = u(g \cdot o) \quad \text{for all } g, h \in G \qquad (1) \]

This result is most easily interpreted if rank $(G/K) = 1$. Then the orbit $K \cdot (h \cdot o)$ is a sphere and $gK \cdot (h \cdot o)$ is a sphere with center $g \cdot o$. Thus the theorem states in this case that $u$ is harmonic if and only if the mean value of $u$ over an arbitrary sphere is equal to the value of $u$ in the center (cf. Gauss' mean value theorem for harmonic functions in $\mathbf{R}^n$).
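The rank one case can be tried out numerically on the non-Euclidean disk. The sketch below makes several assumptions not spelled out at this point of the text: $G = SU(1, 1)$ acts on the disk by $z \to (az + b)/(\bar{b}z + \bar{a})$, $K$ acts by rotations $z \to e^{i\theta}z$ with $dk = d\theta/2\pi$, and $u(z) = \operatorname{Re}(z^2)$ is taken as a sample harmonic function. All names and sample values are illustrative.

```python
import cmath
import math

def moebius(a, b, z):
    # Action of the matrix [[a, b], [conj(b), conj(a)]] in SU(1, 1) on the disk
    return (a * z + b) / (b.conjugate() * z + a.conjugate())

def u(z):
    # Sample harmonic function on the disk: the real part of z^2
    return (z * z).real

def orbit_mean(a, b, z0, samples=720):
    # Approximates the integral over K of u(g k h.o), with z0 = h.o and
    # k running over equally spaced rotations
    total = 0.0
    for k in range(samples):
        rotated = cmath.exp(2j * math.pi * k / samples) * z0
        total += u(moebius(a, b, rotated))
    return total / samples

# Sample group element g with |a|^2 - |b|^2 = 1 and a sample point h.o
a = complex(math.cosh(0.5), 0.0)
b = math.sinh(0.5) * cmath.exp(0.7j)
z0 = 0.6 * cmath.exp(0.3j)

mean_value = orbit_mean(a, b, z0)
centre_value = u(moebius(a, b, 0j))  # u(g.o)
```

The two numbers agree to high accuracy: the mean of $u$ over the non-Euclidean circle $gK \cdot (h \cdot o)$ equals the value of $u$ at its center $g \cdot o$, as in (1).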


PROOF. Suppose first that $u$ is harmonic and for a fixed $g \in G$ consider the function

\[ F : h \to \int_K \tilde{u}(gkh)\, dk \qquad (h \in G) \]

Let $D$ be an operator in $D(G)$ annihilating the constants. Then using (2) in §4-1,

\[ (DF)(e) = (D^\natural F)(e) = \left[ D^\natural_h \left( \int_K \tilde{u}(gkh)\, dk \right) \right]_{h=e} \]

which by the left invariance of $D^\natural$ equals

\[ \int_K (D^\natural \tilde{u})(gk)\, dk = (D^\natural \tilde{u})(g) \]

(the last relation coming from the right invariance of $D^\natural$ under $K$). However, $D^\natural \tilde{u} = (\pi(D^\natural)u)^\sim = 0$ since $\pi(D^\natural)$ annihilates the constants. Thus $(DF)(e) = 0$ for all $D \in D(G)$ which annihilate the constants.

Since $u$ satisfies the elliptic equation $\Delta u = 0$ and since $\Delta$ has analytic coefficients, it follows from a theorem of Bernstein (John [44], p. 142) that $u$ is also analytic. Hence $\tilde{u}$ and $F$ are also analytic so from Taylor's formula (§2-2) we can conclude that $F$ is constant. But the relation $F(h) = F(e)$ is (1).

On the other hand, suppose (1) holds. Let $D \in D(G/K)$ annihilate the constants. Writing (1) as

\[ \int_K u(gk \cdot x)\, dk = u(g \cdot o) \qquad g \in G,\ x \in X \]

we deduce by applying $D$ to both sides (considered as functions of $x$),

\[ \int_K (Du)(gk \cdot x)\, dk = 0 \]

Taking $x = o$ we conclude $Du = 0$, so $u$ is harmonic.

Now we intend to study bounded harmonic functions $u$ on the symmetric space $G/K$ and prove a Poisson integral representation formula due to Furstenberg [19]. Let $Q_u$ denote the set of all functions $\psi \in L^\infty(G)$ (the space of bounded measurable functions on $G$) such that the sup norm $\|\psi\|_\infty = \sup_{h \in G} |\psi(h)|$ satisfies $\|\psi\|_\infty \le \|u\|_\infty$, and such that

\[ u(g \cdot o) = \int_K \psi(gkh)\, dk \quad \text{for all } g, h \in G \]

According to Godement's theorem $\tilde{u} \in Q_u$, so $Q_u$ is not empty. In addition it is a convex set and closed in the weak* topology of $L^\infty(G)$ (the weakest topology for which all the maps $\psi \to \int f(g)\psi(g)\, dg$ of $L^\infty(G)$ into $\mathbf{C}$ are continuous, $f$ being an integrable function on $G$ and $dg$ being a Haar measure). Since the unit ball in $L^\infty(G)$ is compact in the weak* topology (see, for example, [50]) it follows that $Q_u$ is compact. Now if $\psi \in Q_u$ we have $\psi^{R(g)} \in Q_u$ for all $g \in G$ so $G$ acts as a transformation group of $Q_u$ by right translations. We would like to find a fixed point under the subgroup $MAN$, which then would give us a function on the boundary $G/MAN$.


Definition. A group has the fixed point property if whenever it acts continuously on a locally convex topological vector space by linear transformations leaving a compact convex set $Q \ne \emptyset$ invariant it has a fixed point in the set.


Lemma 2.2. Connected solvable Lie groups have the fixed point property 
(cf. [6], p. 115). 


PROOF. Let $V$ be a locally convex topological vector space and $G$ any
Abelian group of linear transformations of $V$. For each $g \in G$ let
$g_n = (1/n)(1 + g + \cdots + g^{n-1})$; let $G^*$ denote the set of all finite products of
operators of the form $g_n$ ($n \in \mathbf{Z}^+$, $g \in G$). All elements of $G^*$ commute. Let $Q \subset V$ be a nonempty
compact convex subset of $V$ invariant under $G$. By convexity, $hQ \subset Q$ for $h \in G^*$. Let
$h_1, \ldots, h_r \in G^*$. Then for each $i$, $1 \le i \le r$,

$$h_1 \cdots h_r Q = h_i h_1 \cdots h_{i-1} h_{i+1} \cdots h_r Q \subset h_i Q$$

whence

$$h_1 \cdots h_r Q \subset \bigcap_{i=1}^{r} h_i Q$$

so this intersection is $\neq \emptyset$. By compactness of $Q$ (expressed by the finite
intersection property), we have

$$\bigcap_{h \in G^*} hQ \neq \emptyset.$$

Let $x$ be an element in this intersection and let $g \in G$. Then $x \in g_n Q$, so for a
suitable element $y \in Q$,

$$x = \frac{1}{n}(y + gy + \cdots + g^{n-1}y)$$

so

$$gx - x = \frac{1}{n}(g^n y - y) \in \frac{1}{n}(Q + (-Q))$$

for each $n$. Using again the compactness of $Q$ we conclude $g \cdot x = x$.
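The estimate behind the last step can be illustrated in finite dimensions. This is a minimal sketch under stated assumptions: V = R^5, g is the cyclic shift of coordinates (a hypothetical choice), and Q is the probability simplex, which g leaves invariant. It shows that the averaged points g_n y are moved by g by at most 1/n in sup norm.

```python
def apply_cyclic(v):
    """The linear map g: cyclic shift of coordinates; it preserves the simplex Q."""
    return [v[-1]] + v[:-1]

def averaged(v, n):
    """Apply g_n = (1/n)(1 + g + ... + g^(n-1)) to the vector v."""
    acc = [0.0] * len(v)
    w = list(v)
    for _ in range(n):
        acc = [s + t for s, t in zip(acc, w)]
        w = apply_cyclic(w)
    return [s / n for s in acc]

def defect(v):
    """Sup norm of g.v - v, which vanishes exactly at fixed points of g."""
    gv = apply_cyclic(v)
    return max(abs(p - q) for p, q in zip(gv, v))

y = [1.0, 0.0, 0.0, 0.0, 0.0]  # a vertex of the simplex Q in R^5
for n in (1, 7, 50, 1001):
    x = averaged(y, n)
    print(n, defect(x))
```

As n grows the defect tends to 0, so any limit point of the averages in the compact set Q is fixed by g; here the limit is the barycenter of the simplex.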




Now assume $G$ is a connected solvable Lie group of linear
transformations of $V$. Let $\mathfrak{g}$ be its Lie algebra and let

$$\mathfrak{g} = \mathfrak{g}_0 \supset \mathfrak{g}_1 \supset \cdots \supset \mathfrak{g}_m = \{0\}, \qquad \mathfrak{g}_{m-1} \neq \{0\}$$

be the sequence of derived algebras, $\mathfrak{g}_i = D^i \mathfrak{g}$. Let $G = G_0 \supset G_1 \supset \cdots \supset
G_m = \{e\}$ be the corresponding series of analytic subgroups of $G$. Suppose
now the lemma holds for all connected solvable Lie groups whose series (as
defined above) has length $< m$. Let $A$ denote the set of points in $Q$ fixed under
all $g \in G_1$. By the induction assumption, $A$ is $\neq \emptyset$ and, of course, $A$ is
convex and compact. Let $\gamma \in G$. If $g \in G_1$ then $\gamma g \gamma^{-1} \in G_1$, so if $x \in A$,
$\gamma g \gamma^{-1} x = x$ so $g \gamma^{-1} x = \gamma^{-1} x$. Thus $\gamma^{-1} x$ is fixed by all elements in $G_1$;
being in $Q$, $\gamma^{-1} x$ belongs to $A$. Thus $G$ maps $A$ into itself. The closed
subspace $V_A$ of $V$ generated by $A$ is locally convex and since $G_1$ acts trivially on
it, $G$ acts on $V_A$ as an Abelian group. By the first part of the proof there
exists a point $v \in A$ fixed under all $g \in G$. Q.E.D.


Lemma 2.3. The group MAN has the fixed point property. 


PROOF. Let $MAN$ act on a locally convex space $V$ and let $Q \subset V$ be a
compact convex subset $\neq \emptyset$ invariant under $MAN$. Since $AN$ is solvable
and connected there exists a point $q \in Q$ fixed under $AN$. If $dm$ denotes the
normalized Haar measure on the compact group $M$ the integral

$$\int_M m \cdot q\, dm$$

(defined by means of approximating sums) represents, because of the
compactness and convexity, a point $q^*$ in $Q$. Since $m(AN)m^{-1} \subset AN$ we have
for $s \in AN$

$$s q^* = \int_M s m \cdot q\, dm = \int_M m(m^{-1} s m) q\, dm = \int_M m \cdot q\, dm = q^*$$

so $q^*$ is fixed under $MAN$.

We recall now that the boundary $B$ of the symmetric space is given by
the coset space representations $B = K/M$, $B = G/MAN$. The latter shows
that $G$ acts on $B$; this action will be denoted $(g, b) \to g(b)$ in order to
distinguish it from the action $(g, x) \to g \cdot x$ of $G$ on $X = G/K$, which we have
already used. Let $db$ denote the unique $K$-invariant measure on $B$ satisfying

$$\int_B db = 1.$$


Theorem 2.4. If $u$ is a bounded harmonic function on $X$ then there exists a
bounded measurable function $\hat{u}$ on $B$ such that

$$u(g \cdot o) = \int_B \hat{u}(g(b))\, db \qquad (2)$$




On the other hand, if $\hat{u}$ is a bounded measurable function on $B$ then $u$ as
defined by (2) is a bounded harmonic function on $X$.


PROOF. As shown above (Lemma 2.3) the set $Q_u$ has a fixed point under
$MAN$, say $u_1$. Define $\hat{u}$ on $G/MAN$ by $\hat{u}(gMAN) = u_1(g)$. Then by the
definition of $Q_u$, we have

$$u(g \cdot o) = \int_K \hat{u}(gkhMAN)\, dk$$

Take $h = e$ and recall that $gkMAN$ is $g(b)$ if $b = kM$. Then (2) follows
because if $F$ is any continuous function on $B$,

$$\int_B F(b)\, db = \int_K F(kM)\, dk$$

On the other hand, if $\hat{u}$ is a function in $L^\infty(B)$, define $u$ by (2). Then

$$u(gkh \cdot o) = \int_B \hat{u}(gkh(b))\, db \qquad (3)$$

Now let $b = k'MAN$; then $gkh(b) = gkhk'MAN = gkk_1MAN$ if $hk' = k_1 a_1 n_1$
(Theorem 5.1, Ch. 3). Hence,

$$\int_K u(gkh \cdot o)\, dk = \int_K \Big( \int_K \hat{u}(gkhk'MAN)\, dk' \Big) dk$$
$$= \int_K \Big( \int_K \hat{u}(gkhk'MAN)\, dk \Big) dk' = \int_K \Big( \int_K \hat{u}(gkk_1MAN)\, dk \Big) dk'$$
$$= \int_K \Big( \int_K \hat{u}(gkMAN)\, dk \Big) dk' = \int_K \hat{u}(gkMAN)\, dk = u(g \cdot o).$$

By Theorem 2.1, $u$ is harmonic, so the theorem is proved.
Now define the Poisson kernel $P(x, b)$ on the product space $X \times B$ by
the Jacobian

$$P(g \cdot o, b) = \frac{d(g^{-1}(b))}{db} \qquad (4)$$

As we saw in Ch. 1, (11) §1-3, this does indeed give the classical Poisson
kernel in the case when $G/K$ is the non-Euclidean disk. We shall give the
general formula for (4) later. But at any rate formula (2) can be written

$$u(x) = \int_B P(x, b)\hat{u}(b)\, db \qquad (5)$$

giving a Poisson integral representation of an arbitrary bounded harmonic
function on $X$. Furstenberg showed in [19], p. 366, that in the weak topology
of measures the values of $\hat{u}$ can be regarded as boundary values of $u$. We
shall now see that this is also the case when we approach the boundary in a
more geometric fashion.
Let $\bar{\mathfrak{n}}$ denote the subalgebra of $\mathfrak{g}$ given by

$$\bar{\mathfrak{n}} = \sum_{\alpha < 0} \mathfrak{g}_\alpha$$

where the $\mathfrak{g}_\alpha$ are given by (2) §3-5. Let $\bar{N}$ denote the corresponding analytic
subgroup of $G$. As an immediate consequence of the Bruhat lemma (see
Harish-Chandra [26]) we have that the subset $\bar{N}MAN \subset G$ is an open subset
whose complement has lower dimension. As a result the mapping $T: \bar{n} \to
k(\bar{n})M$ maps $\bar{N}$ onto a subset of $K/M$ whose complement has lower dimension.
[Here $k(\bar{n})$ is the $K$-component of $\bar{n}$ according to the decomposition $G = KAN$.]
One can also prove that the mapping $T$ is one-to-one.


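Lemma 2.5. For a certain positive integrable function $\psi$ on $\bar{N}$, we have

$$\int_{K/M} f(kM)\, dk_M = \int_{\bar{N}} f(k(\bar{n})M)\psi(\bar{n})\, d\bar{n}, \qquad f \in C^\infty(K/M)$$

Here $dk_M$ is the normalized $K$-invariant measure on $K/M$ and $d\bar{n}$ is a Haar
measure on $\bar{N}$.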


PROOF. Let $dk_M \circ T$ denote the measure on $\bar{N}$ given by

$$(dk_M \circ T)(C) = \int_{T(C)} dk_M, \qquad C \text{ compact in } \bar{N}$$

Let $\psi(\bar{n})$ denote the Radon-Nikodym derivative of $dk_M \circ T$ with respect to $d\bar{n}$
(see, for example, [24], p. 128). Then the lemma follows at once from the
properties of $T$ given above.


REMARK. This lemma is given in Harish-Chandra [27], p. 287, with an
explicit formula for $\psi(\bar{n})$ which will be derived later (Proposition 2.10).

The mapping $T$ is particularly useful for studying the action of $A$ on
the boundary. In fact, if $a \in A$, $\bar{n} \in \bar{N}$ we have

$$a(k(\bar{n})M) = ak(\bar{n})MAN = k(a\bar{n})MAN = k(a\bar{n}a^{-1})MAN$$

that is,

$$a(k(\bar{n})M) = k(\bar{n}^{a})M \qquad (6)$$

the superscript denoting conjugation.


Theorem 2.6. Let $F$ be a continuous function on $B$ and $u$ its Poisson integral

$$u(x) = \int_B P(x, b)F(b)\, db, \qquad x \in X$$




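Then $u$ has boundary values given by $F$, that is,

$$\lim_{t \to +\infty} u(k \exp tH \cdot o) = F(kM) \qquad (7)$$

for each $k \in K$ and each $H \in \mathfrak{a}^+$.


PROOF. We may assume $k = e$. We must prove that if $a_t = \exp tH$ then as
$t \to \infty$

$$\int_{K/M} F(a_t(kM))\, dk_M \to F(eM)$$

But by Lemma 2.5 and (6) the integral on the left equals

$$\int_{\bar{N}} F(a_t(k(\bar{n})M))\psi(\bar{n})\, d\bar{n} = \int_{\bar{N}} F(k(\bar{n}^{a_t})M)\psi(\bar{n})\, d\bar{n} \qquad (8)$$

Now

$$\bar{n} = \exp\Big( \sum_{\alpha < 0} X_\alpha \Big)$$

where $X_\alpha \in \mathfrak{g}_\alpha$ and by (7) and (9) in §3-1,

$$\bar{n}^{\exp H} = \exp H \exp\Big( \sum_{\alpha<0} X_\alpha \Big) \exp(-H) = \exp\Big( \mathrm{Ad}(\exp H) \sum_{\alpha<0} X_\alpha \Big) = \exp\Big( \sum_{\alpha < 0} e^{\alpha(H)} X_\alpha \Big)$$

But $\alpha(H) < 0$ whenever $\alpha < 0$ so we see that for each $\bar{n} \in \bar{N}$, $\bar{n}^{\exp tH} \to e$. It
follows (using the dominated convergence theorem) that the right-hand side
of (8) has a limit

$$\int_{\bar{N}} F(eM)\psi(\bar{n})\, d\bar{n} = F(eM)$$

as $t \to \infty$. This proves the theorem.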
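For the non-Euclidean disk the limit relation (7) is the classical radial boundary behavior of the Poisson integral, and it is easy to observe numerically. The sketch below makes several assumptions that are not in the text: the boundary B is parametrized by the angle theta with db = d(theta)/(2 pi), the kernel is the classical one, and the geodesic exp tH . o is taken to be the radius r = tanh t (the exact parametrization depends on the normalization of the metric).

```python
import math

def poisson_kernel(r, phi, theta):
    """Classical Poisson kernel of the unit disk at z = r e^{i phi}
    for the boundary point e^{i theta}."""
    return (1 - r * r) / (1 - 2 * r * math.cos(phi - theta) + r * r)

def poisson_integral(F, r, phi, n=4096):
    """Integral of P(z, b) F(b) db with db = d(theta) / (2 pi), trapezoid rule."""
    total = 0.0
    for j in range(n):
        theta = 2 * math.pi * j / n
        total += poisson_kernel(r, phi, theta) * F(theta)
    return total / n

def F(theta):
    """A sample continuous boundary function."""
    return math.cos(theta) + 0.5 * math.cos(2 * theta)

for t in (0.5, 1.0, 2.0, 3.0):
    r = math.tanh(t)
    print(t, poisson_integral(F, r, 0.0), F(0.0))
```

For this particular F the Poisson integral is r cos(phi) + 0.5 r^2 cos(2 phi) in closed form, so the printed values can be compared both with that expression and with the boundary value F(0) = 1.5, which they approach as t grows.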
The result above is not new (cf. Karpelevič [46], Theorem 18.3.2 and
also Moore [53], p. 204). Next we prove that the boundary function $\hat{u}$ in
Theorem 2.4 is unique.


Corollary 2.7. Let $F \in L^\infty(B)$ and

$$u(x) = \int_B P(x, b)F(b)\, db \qquad (x \in X)$$

Then if $u = 0$, we have also $F = 0$.
In fact, let $\phi \in L^1(G)$ be continuous and consider the function

$$F_\phi(b) = \int_G \phi(g)F(g(b))\, dg, \qquad b \in B$$




The function $F_\phi$ is continuous (as a convolution of a continuous integrable
function with a bounded function) and its Poisson integral $u_\phi$ is given by

$$u_\phi(h \cdot o) = \int_B P(h \cdot o, b)F_\phi(b)\, db = \int_B F_\phi(h(b))\, db$$
$$= \int_B \Big( \int_G \phi(g)F(gh(b))\, dg \Big) db = \int_G \phi(g)u(gh \cdot o)\, dg$$

Now if $u = 0$ we have $u_\phi = 0$ so by Theorem 2.6, $F_\phi = 0$. But since $\phi$ is
arbitrary, we conclude $F = 0$.


The Topology of $X \cup B$


It is possible to define a topology on the union $X \cup B$ such that the limit
relation (7) is convergence in this topology. A vector $Y \in \mathfrak{p}$ is called regular
if its centralizer $Z_Y$ in $\mathfrak{p}$ is Abelian. A point $x = (\exp Y)K$ in $X$ is called
regular if $Y$ is regular. Now a regular vector $Y \in \mathfrak{p}$ belongs to a unique
Weyl chamber $b_Y$ in the maximal Abelian subspace $Z_Y$. We say that a
sequence of points $x_1, x_2, \ldots$ in $X$ converges to a boundary point $b$ if

(i) Each $x_n = (\exp Y_n)K$ (where $Y_n \in \mathfrak{p}$) is regular
(ii) The Weyl chambers $b_{Y_n}$ converge to $b$ (in the topology of $B$)
(iii) The distance from $Y_n$ to the boundary of $b_{Y_n}$ in $Z_{Y_n}$ tends to $\infty$

It is not hard to verify that this convergence concept (together with the
usual convergence definition on $X$ itself) defines a topology on the union
$X \cup B$.
We shall now prove some measure-theoretic results due to
Harish-Chandra ([25], p. 239, [27], p. 294) and give an explicit formula for the
Poisson kernel $P(x, b)$ as a consequence (cf. also Schiffmann [56]).


Lemma 2.8. Let $dk$, $da$, and $dn$ be left invariant Haar measures on the groups
$K$, $A$, and $N$, respectively. Then for a suitable normalization of the Haar
measure $dg$ of $G$, we have

$$\int_G f(g)\, dg = \int_{K \times A \times N} f(kan)e^{2\rho(\log a)}\, dk\, da\, dn$$

for all $f \in C_c^\infty(G)$. This $\rho$ is defined in §3-5 and log denotes the inverse of
the mapping $\exp: \mathfrak{a} \to A$.


PROOF. Since the mapping $(k, a, n) \to kan$ is a diffeomorphism of $K \times A \times N$
onto $G$ (§3-5) there exists a function $D(k, a, n)$ on $K \times A \times N$ such that

$$\int_G f(g)\, dg = \int_{K \times A \times N} f(kan)D(k, a, n)\, dk\, da\, dn \qquad (9)$$




for all $f \in C_c^\infty(G)$. The groups $G$, $K$, $A$, $N$ are all unimodular, that is, the
left invariant Haar measures are all right invariant. Thus the left-hand side
of (9) does not change if we replace $f(g)$ by $f(k_1 g n_1)$, $k_1 \in K$, $n_1 \in N$. It
follows that $D(k_1^{-1}k, a, n n_1^{-1}) = D(k, a, n)$ so $D(k, a, n)$ is a function $\delta(a)$ of
$a$ alone. Let $a_1 \in A$. Then

$$\int_G f(g)\, dg = \int_G f(ga_1)\, dg = \int_{KAN} f(kana_1)\delta(a)\, dk\, da\, dn$$
$$= \int_{KAN} f(kaa_1(a_1^{-1}na_1))\delta(a)\, dk\, da\, dn$$
$$= \int_{KAN} f(ka(a_1^{-1}na_1))\delta(aa_1^{-1})\, dk\, da\, dn$$
$$= \int_{KAN} f(kan)\delta(aa_1^{-1})J(a_1, n)\, dk\, da\, dn$$

where $J(a_1, n)$ denotes the Jacobian determinant of the mapping $n \to a_1 n a_1^{-1}$
of $N$ onto $N$. The computation in the proof of Theorem 2.6 shows that

$$J(a_1, n) = e^{2\rho(\log a_1)}$$

Thus

$$\delta(a) = \delta(aa_1^{-1})e^{2\rho(\log a_1)}$$

and the lemma follows.
Given $g \in G$, let $k(g) \in K$, $H(g) \in \mathfrak{a}$, $n(g) \in N$ be determined by
$g = k(g) \exp H(g)\, n(g)$.
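To make the components k(g), H(g), n(g) concrete, here is a minimal sketch for G = SL(2, R), which is an assumption made for illustration (the text's own rank-one model is SU(1,1)). K is taken to be the rotation group, A the positive diagonal matrices diag(e^t, e^-t) and N the upper unipotent matrices, so that rho(H) = t for H = diag(t, -t). All function names are hypothetical.

```python
import math

def matmul(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def iwasawa(g):
    """Decompose g in SL(2, R) as g = k * a * n with k a rotation,
    a = diag(e^t, e^-t) and n = [[1, x], [0, 1]]. Returns (k, t, x)."""
    (p, q), (r, s) = g
    norm = math.hypot(p, r)                  # length of the first column of g
    k = [[p / norm, -r / norm], [r / norm, p / norm]]
    t = math.log(norm)                       # H(g) = diag(t, -t)
    x = (p * q + r * s) / (norm * norm)
    return k, t, x

def compose(k, t, x):
    """Rebuild k * a * n from the three components."""
    a = [[math.exp(t), 0.0], [0.0, math.exp(-t)]]
    n = [[1.0, x], [0.0, 1.0]]
    return matmul(matmul(k, a), n)

def conjugated_offset(t, x):
    """Upper-right entry of a n a^{-1}; the map x -> e^{2t} x has Jacobian
    e^{2t} = e^{2 rho(log a)}, the factor J(a, n) in the proof of Lemma 2.8."""
    a = [[math.exp(t), 0.0], [0.0, math.exp(-t)]]
    a_inv = [[math.exp(-t), 0.0], [0.0, math.exp(t)]]
    n = [[1.0, x], [0.0, 1.0]]
    return matmul(matmul(a, n), a_inv)[0][1]

g = [[2.0, 1.0], [3.0, 2.0]]                 # determinant 1
k, t, x = iwasawa(g)
print(t, x)
print(compose(k, t, x))
print(conjugated_offset(0.7, 1.0), math.exp(1.4))
```

The decomposition is just the Gram-Schmidt process applied to the columns of g, which is why the first column alone determines k(g) and H(g).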
Corollary 2.9. The Poisson kernel on $G/K \times K/M$ is given by

$$P(gK, kM) = e^{-2\rho(H(g^{-1}k))}$$


PROOF. The mapping $k \to k(gk)$ is a diffeomorphism of $K$ onto itself. Now
fix $h \in G$. Then for $f \in C_c^\infty(G)$,

$$\int_{KAN} f(kan)e^{2\rho(\log a)}\, dk\, da\, dn = \int_G f(g)\, dg = \int_G f(hg)\, dg \qquad (10)$$

Now if $g = kan$, then

$$hg = hkan = k(hk) \exp H(hk)\, n(hk)an = k(hk) \exp H(hk)\, a\, (a^{-1} n(hk) a n)$$

which we write as $k_1 a_1 n_1$. Then our integral on the right-hand side of (10)
equals

$$\int_{KAN} f(k_1 a_1 n_1)e^{2\rho(\log a)}\, dk\, da\, dn. \qquad (11)$$




But the map $a \to \exp H(hk)\, a$ preserves the measure $da$ and the map $n \to
(a^{-1} n(hk) a)n$ preserves the measure $dn$. The integral (11) therefore equals

$$\int_{KAN} f(k(hk)an)e^{2\rho(\log a)}e^{-2\rho(H(hk))}\, dk\, da\, dn$$

so comparing with the left-hand side of (10), we find

$$\int_K F(k)\, dk = \int_K F(k(hk))e^{-2\rho(H(hk))}\, dk \qquad (F \in C^\infty(K)) \qquad (12)$$

In particular, let us use this for $F(k) = \phi(kM)$, $\phi$ being an arbitrary $C^\infty$
function on the boundary. Since

$$\int_K F(k)\, dk = \int_{K/M} \phi(kM)\, dk_M$$

$$\int_K F(k(hk))e^{-2\rho(H(hk))}\, dk = \int_{K/M} \phi(k(hk)M)e^{-2\rho(H(hk))}\, dk_M$$

and since $k(hk)M = h(kM)$ the corollary follows from (12).
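Identity (12) can be tested numerically in a small case. The sketch below assumes G = SL(2, R) with K the rotation group (an illustrative model, not the text's), identifies a rotation with its angle, and uses rho(H) = t for H = diag(t, -t). Under these assumptions k(hk) is the rotation through the direction of the first column of hk, and e^{-2 rho(H(hk))} is the inverse square of that column's length.

```python
import math

def k_and_H(h, theta):
    """For k = rotation by theta, return the angle of k(hk) and the
    coordinate t of H(hk) = diag(t, -t), where hk = k(hk) exp H(hk) n(hk)."""
    c, s = math.cos(theta), math.sin(theta)
    # First column of h * k, i.e. h applied to (cos theta, sin theta).
    p = h[0][0] * c + h[0][1] * s
    r = h[1][0] * c + h[1][1] * s
    return math.atan2(r, p), math.log(math.hypot(p, r))

def lhs(F, n=2048):
    """Integral of F(k) dk over K with normalized Haar measure."""
    return sum(F(2 * math.pi * j / n) for j in range(n)) / n

def rhs(F, h, n=2048):
    """Integral of F(k(hk)) e^{-2 rho(H(hk))} dk over K."""
    total = 0.0
    for j in range(n):
        angle, t = k_and_H(h, 2 * math.pi * j / n)
        total += F(angle) * math.exp(-2 * t)
    return total / n

def F(phi):
    """A sample smooth function on K, written in the angle variable."""
    return 2.0 + math.cos(2 * phi) + math.sin(phi)

h = [[2.0, 1.0], [3.0, 2.0]]   # an element of SL(2, R)
print(lhs(F), rhs(F, h))
```

Both sums should print a value close to 2, the mean of F. The weight e^{-2 rho(H(hk))} is exactly the derivative of the angle of k(hk) with respect to the angle of k, which is the change of variables behind (12).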
As another application let us compute the function $\bar{n} \to \psi(\bar{n})$ in
Lemma 2.5.


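Proposition 2.10. For a suitable Haar measure $d\bar{n}$ on $\bar{N}$ we have

$$\int_{K/M} f(kM)\, dk_M = \int_{\bar{N}} f(k(\bar{n})M)e^{-2\rho(H(\bar{n}))}\, d\bar{n}, \qquad f \in C^\infty(K/M).$$


PROOF. Fix an element $\bar{n}_0 \in \bar{N}$ and consider the function $kM \to
f(\bar{n}_0(kM))$ on $K/M$. Since $\bar{n}_0(k(\bar{n})M) = k(\bar{n}_0\bar{n})M$ we conclude from Lemma
2.5,

$$\int_{K/M} f(\bar{n}_0(kM))\, dk_M = \int_{\bar{N}} f(k(\bar{n}_0\bar{n})M)\psi(\bar{n})\, d\bar{n} = \int_{\bar{N}} f(k(\bar{n})M)\psi(\bar{n}_0^{-1}\bar{n})\, d\bar{n}$$

and from the definition of the Poisson kernel,

$$\int_{K/M} f(\bar{n}_0(kM))\, dk_M = \int_{K/M} f(kM)P(\bar{n}_0 \cdot o, kM)\, dk_M = \int_{\bar{N}} f(k(\bar{n})M)P(\bar{n}_0 \cdot o, k(\bar{n})M)\psi(\bar{n})\, d\bar{n}$$

Comparing the formulas we conclude,

$$\psi(\bar{n}_0^{-1}\bar{n}) = P(\bar{n}_0 \cdot o, k(\bar{n})M)\psi(\bar{n})$$

so putting $\bar{n} = e$ the proposition follows from Cor. 2.9.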
To conclude this section we state two theorems without proof. Let $\Delta$
denote the Laplace-Beltrami operator on $X$.




Theorem 2.11. Let $u$ be a bounded solution of the equation $\Delta u = 0$ on $X$.
Then $u$ is harmonic.

A probabilistic proof of this theorem is given in Furstenberg [19]
(cf. also Berezin [2] and Karpelevič [46]).

Using this result, A. Korányi and the author ([38]) have proved the
following theorem which generalizes the classical Fatou theorem for the unit
disk.


Theorem 2.12. Let $u$ be a bounded solution of the equation $\Delta u = 0$ on $X$.
Then for almost all geodesics $t \to \gamma(t)$ in $X$ starting at the origin $o$ the limit

$$\lim_{t \to \infty} u(\gamma(t))$$

exists.


4-3 Spherical Functions on Symmetric Spaces 


Let $X = G/K$ be a symmetric space of the noncompact type as in the
last section. A spherical function on $G/K$ is by definition a $K$-invariant
eigenfunction $\phi$ of all the operators $D \in \mathbf{D}(G/K)$ satisfying $\phi(o) = 1$. According
to a theorem of Harish-Chandra the spherical functions are precisely the
functions on $G/K$ given by

$$\phi_\lambda(gK) = \int_K e^{(i\lambda - \rho)(H(gk))}\, dk \qquad (1)$$

where $\lambda$ is an arbitrary complex-valued linear function on $\mathfrak{a}$.

In the simplest case when $X$ is the non-Euclidean disk $D$ from Ch. 1
the spherical functions are the Legendre functions $P_\nu$ and their integral
formula

$$P_\nu(\cosh r) = \frac{1}{2\pi} \int_0^{2\pi} (\cosh r + \sinh r \cos\theta)^\nu\, d\theta$$

is the simplest example of (1) (see, for example, [31], p. 406).
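The integral formula for the Legendre functions is easy to check for non-negative integer degree, where it can be compared with the Legendre polynomials. The following sketch is illustrative only: the integer degrees are chosen because a closed form is available for comparison, and the function names are hypothetical.

```python
import math

def legendre_integral(nu, r, n=256):
    """(1 / 2 pi) times the integral over [0, 2 pi] of
    (cosh r + sinh r cos theta)^nu, computed by the trapezoid rule."""
    c, s = math.cosh(r), math.sinh(r)
    total = sum((c + s * math.cos(2 * math.pi * j / n)) ** nu for j in range(n))
    return total / n

def legendre_poly(m, x):
    """Legendre polynomial P_m(x) by the three-term recurrence
    (j + 1) P_{j+1} = (2j + 1) x P_j - j P_{j-1}."""
    p_prev, p = 1.0, x
    if m == 0:
        return p_prev
    for j in range(1, m):
        p_prev, p = p, ((2 * j + 1) * x * p - j * p_prev) / (j + 1)
    return p

r = 0.8
for m in range(5):
    print(m, legendre_integral(m, r), legendre_poly(m, math.cosh(r)))
```

The base cosh r + sinh r cos(theta) is bounded below by e^{-r} > 0, so the same quadrature also makes sense for non-integer or complex exponents.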

We shall now state Harish-Chandra's result ([27], p. 612, [28], p. 48)
which describes how an arbitrary $K$-invariant function $f \in C_c^\infty(X)$ can be
decomposed into spherical functions. In view of Theorem 7.1, Ch. 3 such a
function $f$ is completely determined by the values $f(a \cdot o)$, $(a \in A^+)$ and we
define the transform (spherical Fourier transform) $\tilde{f}(\lambda)$ by

$$\tilde{f}(\lambda) = \int_{A^+} f(a \cdot o)\phi_{-\lambda}(a)D(a)\, da \qquad (\lambda \in \mathfrak{a}^*) \qquad (2)$$

Here $\mathfrak{a}^*$ is the dual of the vector space $\mathfrak{a}$ and the function $D(a)$ is the density
for the volume element $dx$ on $X$ in polar coordinates (Theorem 7.1, Ch. 3).
More precisely, if $x = ka \cdot o$ then $dx = D(a)\, dk_M\, da$.




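The problem is now to invert formula (2). Motivated by the spectral
theory of singular ordinary differential operators, Harish-Chandra expands
the function $\phi_\lambda(\exp H)$ in a series of the form

$$\phi_\lambda(\exp H) = \sum_\mu \Big( \sum_{s \in W} \psi_\mu(s\lambda)e^{is\lambda(H)} \Big) e^{-(\rho + \mu)(H)} \qquad (H \in \mathfrak{a}^+) \qquad (3)$$

Here $\mu$ runs through a certain subset of $\mathfrak{a}^*$, the $\psi_\mu$ are certain functions on
$\mathfrak{a}^*$ and $W$ denotes the Weyl group (which acts on $\mathfrak{a}^*$ by duality). The
dominating term in this series has the form

$$e^{-\rho(H)} \sum_{s \in W} c(s\lambda)e^{is\lambda(H)} \qquad (4)$$

where $1/c(\lambda)$ is a certain analytic function on $\mathfrak{a}^*$. From (1) above and Prop.
2.10, Harish-Chandra derives the integral formula

$$c(\lambda) = \int_{\bar{N}} e^{-(i\lambda + \rho)(H(\bar{n}))}\, d\bar{n} \qquad (5)$$

whenever the integral converges absolutely.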


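Theorem 3.1. The inverse of the spherical Fourier transform $f \to \tilde{f}$ in (2) is
given by

$$f(a \cdot o) = \int_{\mathfrak{a}^*} \tilde{f}(\lambda)\phi_\lambda(a)\,|c(\lambda)|^{-2}\, d\lambda \qquad (6)$$

where $d\lambda$ is a constant multiple of the Euclidean measure on $\mathfrak{a}^*$.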

The simplest case of this theorem is the inversion formula for the Mehler 
transform stated in Ch. 1. 

We shall now attempt to describe some of the main steps in the proof
of this theorem. For a restricted root $\alpha > 0$ let $m_\alpha = \dim(\mathfrak{g}_\alpha)$, where $\mathfrak{g}_\alpha$ is as
defined in §3-5. Let $(\ ,\ )$ denote the inner product on $\mathfrak{a}^*$ induced by the
Killing form $B$ of $\mathfrak{g}$, restricted to $\mathfrak{a}$.

(i) The function $c(\lambda)$ is given by $c(\lambda) = I(i\lambda)/I(\rho)$, where

$$I(\nu) = \prod_{\alpha > 0} B\Big( \frac{m_\alpha}{2},\ \frac{m_{\alpha/2}}{4} + \frac{(\nu, \alpha)}{(\alpha, \alpha)} \Big) \qquad (7)$$

and $B$ denotes the Beta function,

$$B(x, y) = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x + y)}$$

Let us first consider the case rank $(G/K) = 1$. Then $\phi_\lambda(a)$ is a function
of one real variable and is characterized by a single second-order ordinary
differential equation (which comes from $\phi_\lambda$ being an eigenfunction of $\Delta$).
One finds then that $\phi_\lambda$ is given by a hypergeometric function. If one now
compares the series expansion for the hypergeometric function with the
expansion (3), formula (7) follows. For the details see Harish-Chandra
[27], p. 301.

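Bhanu-Murthy [4, 5] extended (7) to several other special cases
whereupon Gindikin and Karpelevič [21] proved (7) in general along the following
lines. Let $\alpha > 0$ be a restricted root which is not a positive integral multiple
of other restricted roots. Let $\mathfrak{g}^\alpha$ denote the subalgebra of $\mathfrak{g}$ generated by
$\mathfrak{g}_\alpha$ and $\mathfrak{g}_{-\alpha}$. Then $\mathfrak{g}^\alpha$ is semisimple and has a Cartan decomposition

$$\mathfrak{g}^\alpha = \mathfrak{k}^\alpha + \mathfrak{p}^\alpha, \qquad \mathfrak{k}^\alpha = \mathfrak{g}^\alpha \cap \mathfrak{k}, \qquad \mathfrak{p}^\alpha = \mathfrak{g}^\alpha \cap \mathfrak{p} \qquad (8)$$

Let $G^\alpha$ and $K^\alpha$ denote the analytic subgroups of $G$ corresponding to $\mathfrak{g}^\alpha$ and
$\mathfrak{k}^\alpha$, respectively. The symmetric space $G^\alpha/K^\alpha$ (which can be identified with
the orbit $G^\alpha \cdot o$ and is a totally geodesic submanifold of $G/K$) has rank one.
In fact if $\mathfrak{a}^\alpha$ denotes the orthogonal complement in $\mathfrak{a}$ of the hyperplane
$\alpha(H) = 0$ then $\mathfrak{a}^\alpha$ is maximal Abelian in $\mathfrak{p}^\alpha$. Now $G^\alpha$ has an Iwasawa
decomposition $G^\alpha = K^\alpha A^\alpha N^\alpha$, and the $c$-function for $G^\alpha/K^\alpha$, denoted $c^\alpha$, is given by an
integral of the form (5) over the group $\bar{N}^\alpha$. Now Gindikin and Karpelevič
prove that the product of these integrals (for the various $\alpha$) is equal to the
integral (5) over $\bar{N}$; more precisely,

$$c(\lambda) = \prod_\alpha c^\alpha(\lambda^\alpha) \qquad (9)$$

where $\lambda^\alpha$ denotes the restriction of $\lambda$ to $\mathfrak{a}^\alpha$ and $\alpha$ runs through the restricted
roots specified above. Now (7) follows from the rank-one case.

Now let $\mathscr{S}(\mathfrak{a}^*)$ denote the set of rapidly decreasing functions on $\mathfrak{a}^*$ in
the sense of Schwartz [57] and let $\mathscr{I}(\mathfrak{a}^*)$ denote the set of $W$-invariant
functions in $\mathscr{S}(\mathfrak{a}^*)$. (Here $W$ is the Weyl group.)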

(ii) Let $\mu \in \mathfrak{a}^*$. Then the mapping

$$S_\mu : b \to \int_{A^+} \phi_{-\mu}(a) \Big( \int_{\mathfrak{a}^*} b(\lambda)\phi_\lambda(a)\,|c(\lambda)|^{-2}\, d\lambda \Big) D(a)\, da$$

[$b \in \mathscr{I}(\mathfrak{a}^*)$] is a tempered distribution on $\mathfrak{a}^*$.

It is easy to see from (7) that the integral over $\lambda$ is absolutely convergent.
On the other hand to show that the integral with respect to $a$ is absolutely
convergent and makes $S_\mu$ a distribution requires very detailed study of the
behavior of $\phi_\lambda(a)$ for large $a$ (see Harish-Chandra [27], p. 588).

Eventually one wants to prove that for a suitable normalization of $d\lambda$,
$S_\mu(b) = b(\mu)$. But first one proves

(iii) If $p$ is a Weyl group invariant polynomial on $\mathfrak{a}^*$ then

$$pS_\mu = p(\mu)S_\mu$$




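To see this select a differential operator $D \in \mathbf{D}(G/K)$ such that $D\phi_\lambda =
p(\lambda)\phi_\lambda$ (see [27], p. 591 or [31], p. 432). Then

$$pS_\mu(b) = S_\mu(pb) = \int_X \phi_{-\mu}(x) \Big( \int_{\mathfrak{a}^*} p(\lambda)b(\lambda)\phi_\lambda(x)\,|c(\lambda)|^{-2}\, d\lambda \Big) dx$$

Here we replace $p(\lambda)\phi_\lambda(x)$ by $(D\phi_\lambda)(x)$ and carry $D$ over on $\phi_{-\mu}$ by replacing
it with its adjoint; the result is $p(\mu)S_\mu(b)$ as desired.


As a fairly easy consequence of (iii) we obtain (cf. [27], p. 591).
(iv) There exists a function $\gamma$ on $\mathfrak{a}^*$ such that

$$S_\mu(b) = \gamma(\mu)b(\mu), \qquad b \in \mathscr{I}(\mathfrak{a}^*).$$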


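Now we must prove that $\gamma$ is a constant. Consider for $f$ as in (2) the
function $F_f$ defined by

$$F_f(a) = e^{\rho(\log a)} \int_{\bar{N}} f(\bar{n}a \cdot o)\, d\bar{n}, \qquad a \in A$$

Then we have as a simple consequence of (1) and Lemma 2.8 that

$$\tilde{f}(\lambda) = \int_X f(x)\phi_{-\lambda}(x)\, dx = \int_A F_f(a)e^{-i\lambda(\log a)}\, da \qquad (10)$$

If $b \in \mathscr{I}(\mathfrak{a}^*)$ consider the function

$$\phi_b(x) = \int_{\mathfrak{a}^*} b(\lambda)\phi_\lambda(x)\,|c(\lambda)|^{-2}\, d\lambda$$

The integral for $F_{\phi_b}$ can be shown to converge and by the inversion formula
for the Fourier transform on $A$ and $\mathfrak{a}^*$ we obtain

$$F_{\phi_b}(a) = e^{\rho(\log a)} \int_{\bar{N}} \phi_b(\bar{n}a \cdot o)\, d\bar{n} = \int_{\mathfrak{a}^*} \tilde{\phi}_b(\lambda)e^{i\lambda(\log a)}\, d\lambda$$
$$= \int_{\mathfrak{a}^*} S_\lambda(b)e^{i\lambda(\log a)}\, d\lambda = \int_{\mathfrak{a}^*} \gamma(\lambda)b(\lambda)e^{i\lambda(\log a)}\, d\lambda$$
$$= \frac{1}{w} \int_{\mathfrak{a}^*} \gamma(\lambda)b(\lambda) \sum_{s \in W} e^{is\lambda(\log a)}\, d\lambda$$

where $w$ denotes the order of $W$. The relation $\gamma = w$ would therefore result
from the following statement.
(v) The relation

$$|c(\lambda)|^{-2} e^{\rho(\log a)} \int_{\bar{N}} \phi_\lambda(\bar{n}a \cdot o)\, d\bar{n} = \sum_{s \in W} e^{is\lambda(\log a)} \qquad (11)$$

holds in the weak sense in $\lambda$, that is, it gives the right result when integrated
against any $b \in \mathscr{I}(\mathfrak{a}^*)$.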




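This is carried out by means of a beautiful analysis in §15, p. 597, of
Harish-Chandra [27]. Here we have to settle for a vague plausibility
argument. Writing $\bar{n}a = k_1 a' k_2$ $(k_1, k_2 \in K,\ a' \in A^+)$ we have (loc. cit.
p. 604)

$$\log a' \sim \log a + H(\bar{n})$$

as $a \to \infty$ in $A^+$. Since (4) is the dominating term in the expansion for
$\phi_\lambda(\exp H)$ let us replace $\phi_\lambda(\bar{n}a \cdot o) = \phi_\lambda(a' \cdot o)$ by

$$e^{-\rho(\log a + H(\bar{n}))} \sum_{s \in W} c(s\lambda)e^{is\lambda(\log a + H(\bar{n}))}$$

When this expression is integrated over $\bar{N}$ we obtain from (5) the expression

$$e^{-\rho(\log a)} \sum_{s \in W} c(s\lambda)c(-s\lambda)e^{is\lambda(\log a)}$$

which equals $e^{-\rho(\log a)}\,|c(\lambda)|^2 \sum_{s \in W} e^{is\lambda(\log a)}$ in accordance with (11).

In order to deduce Theorem 3.1 from the relation $S_\mu(b) = (\mathrm{const})\,b(\mu)$
$(b \in \mathscr{I}(\mathfrak{a}^*))$ we still have to prove the following statement.
(vi) Each $K$-invariant function $f \in C_c^\infty(X)$ can be written in the form

$$f(x) = \int_{\mathfrak{a}^*} b(\lambda)\phi_\lambda(x)\,|c(\lambda)|^{-2}\, d\lambda, \qquad b \in \mathscr{I}(\mathfrak{a}^*)$$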


This was stated as a conjecture in Harish-Chandra [27], p. 612, and 
was finally proved by him in [28], p. 48. Since this proof involves so much 
work on the general Plancherel formula for G (in particular, the discrete 
series) it would not be feasible to describe it here. Instead let me outline a 
different effort [37] at proving (vi). 


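Let $F$ be a $W$-invariant function in $C_c^\infty(A)$ and $F^*$ its Fourier transform

$$F^*(\lambda) = \int_A F(a)e^{-i\lambda(\log a)}\, da$$

Writing the expansion (3) as

$$\phi_\lambda(\exp H) = \sum_\mu \Psi_\mu(\lambda, H) \qquad (H \in \mathfrak{a}^+) \qquad (12)$$

we assume that the term-by-term integration

$$\int_{\mathfrak{a}^*} F^*(\lambda)\phi_\lambda(\exp H)\,|c(\lambda)|^{-2}\, d\lambda = \sum_\mu \int_{\mathfrak{a}^*} F^*(\lambda)\Psi_\mu(\lambda, H)\,|c(\lambda)|^{-2}\, d\lambda \qquad (13)$$

is permissible. Then we have (loc. cit. p. 302).
(vii) For $H \in \mathfrak{a}$ let $|H| = B(H, H)^{1/2}$. Suppose $R > 0$ such that
$F(\exp H) = 0$ for $|H| > R$. Then

$$\int_{\mathfrak{a}^*} F^*(\lambda)\Psi_\mu(\lambda, H)\,|c(\lambda)|^{-2}\, d\lambda = 0 \qquad \text{for } |H| > R \qquad (14)$$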




This is proved by translating the integration into the complexification
$\mathfrak{a}^* + i\mathfrak{a}^*$ by use of Cauchy's theorem. Because of the formula (7) the
function $c(\lambda)^{-1}$ can be extended to a function on $\mathfrak{a}^* + i\mathfrak{a}^*$ with singularities, whose
location can be determined. The functions $\Psi_\mu(\lambda, H)$ are determined by
certain recursion formulas which result from $\phi_\lambda$ being an eigenfunction of
each $D \in \mathbf{D}(G/K)$. It is therefore possible to describe the sets of singularities
of the functions $\Psi_\mu(\lambda, H)$ and the integration in $\mathfrak{a}^*$ can by Cauchy's theorem
be translated away from these sets. This leads to estimates of the integral,
which prove (14).

In order to prove (vi) let $f \in C_c^\infty(X)$ be $K$-invariant and let us use (14)
on the function $F(a) = F_f(a)$, $(a \in A)$. We put

$$g(x) = \int_{\mathfrak{a}^*} F^*(\lambda)\phi_\lambda(x)\,|c(\lambda)|^{-2}\, d\lambda \qquad (15)$$

and by (13) and (14) we have $g \in C_c^\infty(X)$ and $K$-invariant. On the other
hand, we have by (10) and the result $S_\mu(b) = b(\mu)$ (with $d\lambda$ suitably normalized),

$$\tilde{g}(\lambda) = F_g^*(\lambda) = F^*(\lambda) \qquad (16)$$

The Euclidean Fourier transform $F \to F^*$ is one-to-one so the last relation
implies

$$F_g(a) = F(a) = F_f(a)$$

Thus, in view of (10), the function $h = f - g$ is a $K$-invariant function in $C_c^\infty(X)$
satisfying

$$\int_X h(x)\phi_{-\lambda}(x)\, dx = 0$$

for all complex-valued linear forms $\lambda$ on $\mathfrak{a}$. It is well-known (see, for
example, [31], p. 409, 453) that this implies $h = 0$, so

$$f(x) = \int_{\mathfrak{a}^*} F^*(\lambda)\phi_\lambda(x)\,|c(\lambda)|^{-2}\, d\lambda$$

which gives (vi).

What is lacking in this proof of (vi) is a justification of the term-by-term
integration (13). In the quoted paper this justification is given for the case
rank $(G/K) = 1$; in this case the proof also gives a Paley-Wiener type of
theorem for the transform $f \to \tilde{f}$, that is, an intrinsic characterization of the
functions $\tilde{f}(\lambda)$ as $f$ runs through the $K$-invariant functions in $C_c^\infty(X)$.

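We conclude this section with a simple remark on the formulas

$$\tilde{f}(\lambda) = \int_{A^+} f(a \cdot o)\phi_{-\lambda}(a)D(a)\, da$$

$$f(a \cdot o) = \int_{\mathfrak{a}^*} \tilde{f}(\lambda)\phi_\lambda(a)\delta(\lambda)\, d\lambda, \qquad \delta(\lambda) = |c(\lambda)|^{-2}$$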




In analogy with the product formula (9)

$$\delta(\lambda) = \prod_\alpha \delta_\alpha(\lambda^\alpha) \qquad (17)$$

one can prove (and this is an elementary result) that

$$D(\exp H) = \prod_\alpha D_\alpha(\exp H^\alpha) \qquad (18)$$

where $D_\alpha$ is the $D$ function for the space $G^\alpha/K^\alpha$, and $H^\alpha$ is the projection of $H$
on $\mathfrak{a}^\alpha$. It seems conceivable that a fuller understanding of the reason for
the product formulas (17) and (18) might lead to a reduction of Theorem 3.1
to the rank-one case.


4-4 Fourier Transform on Symmetric Spaces 


As before let $X$ denote the symmetric space $G/K$. Now we would
like to define a Fourier transform for arbitrary functions $f \in C_c^\infty(X)$, not just
for the $K$-invariant ones. We motivate this by means of the definition given
in §1-3 for the non-Euclidean disk $D$. In this case the group $G$ equals $SU(1, 1)$
and as calculated in §3-5 the group $N$ consists of the group of matrices

$$\begin{pmatrix} 1 + in & -in \\ in & 1 - in \end{pmatrix}, \qquad n \in \mathbf{R}$$

The orbit $N \cdot o$ consists of the points $in/(in - 1)$, which clearly form a
horocycle and it is a simple matter to verify that the horocycles in $D$ are the
orbits in $D$ of all groups of the form $gNg^{-1}$.
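That the orbit N . o is a horocycle can be seen directly: the points in/(in - 1) all lie on the circle |z - 1/2| = 1/2, which is tangent to the unit circle at the boundary point 1. A minimal numerical sketch, assuming the usual fractional linear action of 2x2 matrices on the disk (the function names and the parameter name t are illustrative):

```python
def act(matrix, z):
    """Fractional linear action of a 2x2 complex matrix on a point z."""
    (a, b), (c, d) = matrix
    return (a * z + b) / (c * z + d)

def n_matrix(t):
    """Element of N in SU(1,1) with real parameter t, as in the displayed matrix."""
    return ((1 + 1j * t, -1j * t), (1j * t, 1 - 1j * t))

for t in (-5.0, -1.0, 0.0, 0.5, 3.0, 100.0):
    z = act(n_matrix(t), 0j)
    # Distance from the Euclidean center 1/2 of the horocycle.
    print(t, z, abs(z - 0.5))
```

Every printed distance is 1/2, and as |t| grows the orbit point tends to 1, the point of tangency with the boundary.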
Hence, we define for the general symmetric space $X = G/K$ a horocycle
to be an orbit in $X$ of a subgroup of $G$ of the form $gNg^{-1}$, $g$ being an arbitrary
element in $G$.

Lemma 4.1. The group G permutes the horocycles transitively. 


PROOF. The most general horocycle $\xi$ is of the form $\xi = gNg^{-1}h \cdot o$, $g$ and $h$
being fixed elements in $G$. By the Iwasawa decomposition we can write
$h^{-1}g = kan$ and deduce (since $aNa^{-1} \subset N$) that $gNg^{-1}h \cdot o = hkN \cdot o$. In
other words, the element $hk \in G$ maps the horocycle $\xi_0 = N \cdot o$ onto $\xi$, so the
lemma is proved.

In particular, all the horocycles are submanifolds of $X$ of the same
dimension and since $N \cap K = \{e\}$ the mapping $n \to n \cdot o$ is a diffeomorphism
of $N$ onto $\xi_0$.


Lemma 4.2. Each horocycle $\xi$ can be written

$$\xi = ka \cdot \xi_0 \qquad (1)$$

where $a \in A$ is unique and the coset $kM \in K/M$ is unique.




Although the proof of this lemma is not difficult we shall not stop to
prove it here. For the case $X = D$ the lemma is quite obvious.


Definition. The Weyl chamber $kM$ in (1) is called the normal to the horocycle
$\xi$; the element $a \in A$ in (1) is called the complex distance from $o$ to $\xi$.

Considering the example $X = D$ the term "normal" is quite reasonable;
so is the term "complex distance" because the point $ka \cdot o$ is the unique point
in $\xi$ at minimum distance from $o$. (If $a = \exp H$, $H \in \mathfrak{a}$, the distance is
$B(H, H)^{1/2}$, cf. [37], p. 306.)

We recall now that given the maximal Abelian subspace $\mathfrak{a} \subset \mathfrak{p}$, the
group $N$ is determined following a choice of a Weyl chamber $\mathfrak{a}^+ \subset \mathfrak{a}$.


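Lemma 4.3. Let $\mathfrak{a}_1, \ldots, \mathfrak{a}_w$ denote the various Weyl chambers in $\mathfrak{a}$ and
$N_1, \ldots, N_w$ the corresponding Iwasawa groups. Then the horocycles
$N_1 \cdot o, \ldots, N_w \cdot o$ all have the same tangent space at the point $o$.


PROOF. The projection $\pi: G \to G/K$ given by $\pi(g) = g \cdot o$ maps $N$ onto $\xi_0$
and the differential $d\pi: \mathfrak{g} \to (G/K)_o$ maps $\mathfrak{n}$ onto $(\xi_0)_o$. But the map $d\pi: \mathfrak{p} \to
(G/K)_o$ is an isomorphism so let $\mathfrak{q} \subset \mathfrak{p}$ be the subspace which $d\pi$ maps onto
$(\xi_0)_o$. We shall prove that the manifolds $N \cdot o$ and $A \cdot o$ are orthogonal at
$o$ and since

$$(\xi_0)_o = d\pi(\mathfrak{q}), \qquad (A \cdot o)_o = d\pi(\mathfrak{a})$$

it suffices, because of the choice of metric on $G/K$ (§3-3), to prove $B(\mathfrak{q}, \mathfrak{a}) = 0$,
that is, $\mathfrak{q}$ and $\mathfrak{a}$ are orthogonal with respect to $B$. But if $H \in \mathfrak{a}$, $X \in \mathfrak{q}$ then
there exists an $X_1 \in \mathfrak{n}$ such that $d\pi(X) = d\pi(X_1)$. Thus $X - X_1 \in \mathfrak{k}$ so since
$B(\mathfrak{a}, \mathfrak{k}) = 0$ and $B(\mathfrak{a}, \mathfrak{n}) = 0$, we obtain

$$B(X, H) = B(X_1, H) = 0$$

Thus each of the tangent spaces $(N_i \cdot o)_o$ is perpendicular to the tangent
space $(A \cdot o)_o$ and since $\dim N \cdot o + \dim A \cdot o = \dim G/K$, the lemma follows.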


Lemma 4.4. Given $x \in X$, $b \in B$, there exists exactly one horocycle passing
through $x$ with normal $b$.


PROOF. Let $b = kM$. We must find a unique $a \in A$ such that $x$ lies on the
horocycle $\xi = ka \cdot \xi_0$. But $x \in \xi$ means $x = kan \cdot o$ for some $n \in N$ so
$an \cdot o = k^{-1} \cdot x$. Thus, by the Iwasawa decomposition, $a$ is uniquely
determined by $k$ and $x$.

We denote the horocycle determined by this lemma by $\xi(x, b)$ and write
$\exp A(x, b)$ $(A(x, b) \in \mathfrak{a})$ for the complex distance from $o$ to $\xi(x, b)$. We can
now write down the analogs of the functions $e^{\mu\langle z, b\rangle}$ in §1-3.




For $b \in B$ and $\mu$ a complex-valued linear function on $\mathfrak{a}$, define the
function $e_{\mu,b}$ by

$$e_{\mu,b}(x) = e^{\mu(A(x, b))}, \qquad x \in X$$

We state without proof two properties of $e_{\mu,b}$, the second of which is
trivial.

(i) $e_{\mu,b}$ is an eigenfunction of each operator $D \in \mathbf{D}(G/K)$

(ii) $e_{\mu,b}$ is constant on each horocycle with normal $b$. A function on $X$
with this property will be called a plane wave with normal $b$.

One can also prove that these two properties characterize the functions
$e_{\mu,b}$ (if certain singular eigenvalue systems are excluded). In accordance with
the definition in §1-3 we define Fourier analysis on the symmetric space $X$ to
be a decomposition of "arbitrary" functions on $X$ into functions of the
form $e_{\mu,b}$.

As before let $dx$ denote the volume element on $X$ and

$$\rho = \frac{1}{2} \sum_{\alpha > 0} (\dim \mathfrak{g}_\alpha)\,\alpha$$

Let $\mathfrak{a}^*$ denote the dual of $\mathfrak{a}$, that is, the set of real linear functions on $\mathfrak{a}$. Then
the following theorem holds (cf. [35]).


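Theorem 4.5. For $f \in C_c^\infty(X)$ define the Fourier transform $\tilde{f}$ on $\mathfrak{a}^* \times B$ by

$$\tilde{f}(\lambda, b) = \int_X f(x)e^{(-i\lambda + \rho)(A(x, b))}\, dx, \qquad \lambda \in \mathfrak{a}^*,\ b \in B$$

Then

$$f(x) = \int_{\mathfrak{a}^*} \int_B \tilde{f}(\lambda, b)e^{(i\lambda + \rho)(A(x, b))}\,|c(\lambda)|^{-2}\, d\lambda\, db$$

if the Euclidean measure $d\lambda$ on $\mathfrak{a}^*$ is suitably normalized.
This theorem is proved by reducing it to Theorem 3.1 in a way which is
similar to the reduction of Theorem 3.1, Ch. 1, to the inversion formula for
the Mehler transform. That reduction made use of the geometric identity

$$\langle \tau \cdot z, \tau \cdot b \rangle = \langle z, b \rangle + \langle \tau \cdot o, \tau \cdot b \rangle \qquad (2)$$

and the formula

$$\frac{d(\tau \cdot b)}{db} = e^{-2\langle \tau \cdot o,\, \tau \cdot b \rangle} \qquad (3)$$

valid for an arbitrary isometry $\tau$ of the non-Euclidean disk $D$.
The generalization of the formula (2) to the symmetric space $X$ is

$$A(g \cdot x, g(b)) = A(x, b) + A(g \cdot o, g(b)) \qquad (4)$$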


for $g \in G$, $x \in X$ and $b \in B$. (Here the action of $G$ on $X$ and on $B$ is denoted
as in §2.) In order to prove (4) let $x = hK$, $b = kM$. Then

$$h \cdot o \in k \exp A(x, b)\, N \cdot o$$

so for some $n_1 \in N$, $k_1 \in K$

$$gh = gk \exp A(x, b)\, n_1 k_1$$

which by the Iwasawa decomposition can be written

$$gh = k(gk) \exp H(gk)\, n(gk) \exp A(x, b)\, n_1 k_1$$

Since $aNa^{-1} \subset N$ $(a \in A)$, this relation implies

$$g \cdot x \in k(gk) \exp (H(gk) + A(x, b))\, N \cdot o$$

and since $k(gk)M = g(kM)$, we conclude

$$A(g \cdot x, g(b)) = H(gk) + A(x, b) \qquad (5)$$

On the other hand, we have by the definition of $A(g \cdot o, kM)$ that for
some $n_2 \in N$, $k_2 \in K$,

$$g = k \exp A(g \cdot o, kM)\, n_2 k_2$$

so

$$H(g^{-1}k) = -A(g \cdot o, kM) \qquad (6)$$

Hence, (5) becomes

$$A(g \cdot x, g(b)) = -A(g^{-1} \cdot o, b) + A(x, b)$$

In particular, putting $x = o$, we get $A(g \cdot o, g(b)) = -A(g^{-1} \cdot o, b)$, so the
desired formula (4) follows. The generalization of (3) to the space $X$ is
given by

$$\frac{d(g(b))}{db} = e^{2\rho(A(g^{-1} \cdot o,\, b))} \qquad (7)$$

and this of course is a direct consequence of Cor. 2.9 and (6). Now the
proof of Theorem 4.5 proceeds essentially as the proof of Theorem 3.1 in
Ch. 1.
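For the disk, the identity (2) and the Jacobian formula (3) can both be verified numerically. The sketch below rests on an assumption about normalization that is not restated in this section: the bracket is defined through the classical Poisson kernel by e^{2<z,b>} = (1 - |z|^2)/|z - b|^2. The isometry tau and all names are illustrative choices.

```python
import cmath
import math

def bracket(z, b):
    """<z, b>, assuming e^{2<z,b>} = (1 - |z|^2) / |z - b|^2."""
    return 0.5 * math.log((1 - abs(z) ** 2) / abs(z - b) ** 2)

def tau(a, phi, z):
    """An isometry of the disk: z -> (z + a)/(1 + conj(a) z), then rotation by phi."""
    return cmath.exp(1j * phi) * (z + a) / (1 + a.conjugate() * z)

a, phi = 0.4 - 0.3j, 1.1
z = -0.2 + 0.5j
b = cmath.exp(0.7j)

# Identity (2): <tau z, tau b> = <z, b> + <tau o, tau b>.
left = bracket(tau(a, phi, z), tau(a, phi, b))
right = bracket(z, b) + bracket(tau(a, phi, 0j), tau(a, phi, b))
print(left, right)

# Formula (3): the boundary Jacobian d(tau b)/db, by a central difference.
eps = 1e-5
forward = tau(a, phi, b * cmath.exp(1j * eps))
backward = tau(a, phi, b * cmath.exp(-1j * eps))
stretch = abs(forward - backward) / (2 * eps)
predicted = math.exp(-2 * bracket(tau(a, phi, 0j), tau(a, phi, b)))
print(stretch, predicted)
```

Both pairs of printed numbers should agree, the second up to the accuracy of the finite difference. Since (2) is homogeneous in the bracket, it is insensitive to a rescaling of the metric, whereas the factor 2 in (3) is tied to the normalization assumed here.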

Finally we observe that the Poisson integral representation of bounded
harmonic functions on $X$ (cf. (5) in §2) can be written

$$u(x) = \int_B e^{2\rho(A(x, b))}\hat{u}(b)\, db$$

and is, therefore, according to our definition, to be regarded as a formula in
Fourier analysis on $X$.


4-5 Interpretation by Representation Theory; Eigenfunctions of the Invariant Differential Operators


Since the group $G$ leaves the volume element $dx$ on $X$ invariant we get
a unitary representation $T_X$ of $G$ on $L^2(X)$ by associating to each $g \in G$ the
operator $f \to f^{\tau(g)}$ on $L^2(X)$. (Here $f^{\tau(g)}$ denotes the function $x \to f(g^{-1} \cdot x)$.)
We shall now indicate how Theorem 4.5 gives a decomposition of this
representation into irreducible ones.




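For $\lambda \in \mathfrak{a}^*$ let $\mathfrak{H}_\lambda$ denote the vector space

$$\mathfrak{H}_\lambda = \Big\{ h_\lambda(x) = \int_B e^{(i\lambda + \rho)(A(x, b))}h(b)\, db \ \Big|\ h \in L^2(B) \Big\}$$

of functions on $X$. If $\lambda$ is regular, that is, $s\lambda \neq \lambda$ for all $s \neq e$ in the Weyl
group $W$, one can use an irreducibility criterion of Bruhat [7], p. 193, to prove
that the function $h \in L^2(B)$ above is uniquely determined by $h_\lambda$. If we define
a Hilbert space norm on $\mathfrak{H}_\lambda$ by

$$\|h_\lambda\| = \Big( \int_B |h(b)|^2\, db \Big)^{1/2}$$

then the mapping which assigns the operator $h_\lambda(x) \to h_\lambda(g^{-1} \cdot x)$ to each $g \in G$
is by (4) and (7) seen to be a unitary representation $T_\lambda$ of $G$ on $\mathfrak{H}_\lambda$. Using
the irreducibility criterion cited, one can show this representation to be
irreducible. Now with the notation of Theorem 4.5 there is a Plancherel
formula, namely,

$$\int_X |f(x)|^2\, dx = \int_{\mathfrak{a}^*} \int_B |\tilde{f}(\lambda, b)|^2\,|c(\lambda)|^{-2}\, d\lambda\, db$$

In terms of direct integrals of representations (see, for example, Dixmier
[15]), Theorem 4.5 can therefore be written:

$$L^2(X) = \int \mathfrak{H}_\lambda\,|c(\lambda)|^{-2}\, d\lambda, \qquad T_X = \int T_\lambda\,|c(\lambda)|^{-2}\, d\lambda$$

$\lambda$ running through $\mathfrak{a}^*$ (mod $W$).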
The functions in $\mathfrak{H}_\lambda$ are eigenfunctions of each $D \in \mathbf{D}(G/K)$. More
generally, if $T$ is an analytic functional on $B$ and $\mu$ is a complex-valued linear
function on $\mathfrak{a}$, the function

$$f(x) = \int_B e^{\mu(A(x, b))}\, dT(b)$$

is an eigenfunction of each $D \in \mathbf{D}(G/K)$; it appears likely that for sufficiently
general functionals $T$ these functions constitute all the simultaneous
eigenfunctions of the operators $\mathbf{D}(G/K)$ (cf. Theorem 5.1, Ch. 1).


4-6 Invariant Differential Equations on Symmetric Spaces 


We shall now discuss general existence theorems for invariant
differential equations on the symmetric space $G/K$. In order to motivate the method
followed we first describe a well-known geometric method for solving
differential equations in $\mathbf{R}^n$ with constant coefficients (Courant-Lax [14],
Gelfand-Shapiro [20], John [44]). The basis of the method is a formula of Radon-John
which in an explicit manner describes a function on $\mathbf{R}^n$ by means of its
integrals over the various hyperplanes in $\mathbf{R}^n$.




For $f \in C_c^\infty(\mathbf{R}^n)$ let $\hat f(\omega, p)$ denote the integral of $f$ over the hyperplane
$(x, \omega) = p$ (here $\omega$ is a unit vector, $p \in \mathbf{R}$, and $(\,,)$ the scalar product).
The function $\hat f$ is called the Radon transform of $f$.
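
The definition can be checked numerically in the plane. The sketch below is an illustration only and makes assumptions not in the text: it takes $n = 2$, uses the Gaussian $e^{-|x|^2}$ (rapidly decreasing rather than compactly supported), and approximates the line integral by the trapezoidal rule. The names `radon_2d` and `gaussian` are hypothetical helpers. For this function the integral over the line $(x, \omega) = p$ is $\sqrt{\pi}\, e^{-p^2}$, independent of $\omega$.

```python
import math

def radon_2d(f, omega, p, half_width=8.0, steps=4000):
    """Approximate the integral of f over the line (x, omega) = p in R^2.

    omega is a unit vector; the line is parametrised as
    x(s) = p*omega + s*omega_perp and integrated in s by the trapezoidal rule.
    """
    wx, wy = omega
    px, py = -wy, wx                      # unit vector along the line
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        s = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * f(p * wx + s * px, p * wy + s * py)
    return total * h

def gaussian(x, y):
    return math.exp(-(x * x + y * y))

theta = 0.7
omega = (math.cos(theta), math.sin(theta))
p = 0.5
numeric = radon_2d(gaussian, omega, p)
exact = math.sqrt(math.pi) * math.exp(-p * p)
print(abs(numeric - exact) < 1e-9)
```

Because the Gaussian is rotation invariant, changing `theta` should leave `numeric` unchanged up to rounding, which is a convenient consistency check on the parametrisation of the line.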


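Theorem 6.1. For the Radon transform $f \to \hat f$ the following inversion formula
holds:

\[ f(x) = c\, \Delta^{(n-1)/2} \Big( \int_{S^{n-1}} \hat f(\omega, (x, \omega))\, d\omega \Big) \tag{1} \]

for $f \in C_c^\infty(\mathbf{R}^n)$. Here $\Delta$ denotes the Laplacian, $d\omega$ is the surface element on
the unit sphere $S^{n-1}$, and $c$ is a constant.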

For the proof see [44]. There the cases $n$ odd and $n$ even are
presented in different forms; the unified version can be found in [34], p. 163.

Formula (1) states that when for $x \in \mathbf{R}^n$ we form the integral of $f$ over
each hyperplane through $x$, then take the average of these integrals, and
finally apply the operator $\Delta^{(n-1)/2}$, we recover the function $f$. However,
for the applications indicated, the important feature of (1) is an explicit
decomposition of $f$ into plane waves. (A plane wave is a function which is constant
on each hyperplane with a given normal vector; this normal vector is then
called the normal to the plane wave.) In fact, for any fixed $\omega \in S^{n-1}$ the
function $f_\omega : x \to \hat f(\omega, (x, \omega))$ is a plane wave with normal $\omega$.

We shall now apply formula (1) to differential equations. Let $D$ be a
differential operator on $\mathbf{R}^n$ with constant coefficients and consider a
differential equation

\[ Du = f \tag{2} \]

where $f \in C_c^\infty(\mathbf{R}^n)$ is a given function. We begin by considering the
differential equation

\[ Dv = f_\omega \tag{3} \]

where $f_\omega$ is as above and we look for a solution $v$ which is a plane wave with
normal $\omega$. But a plane wave with normal $\omega$ is just a function of one variable;
furthermore if $v$ is a plane wave with normal $\omega$ then so is the function $Dv$.
Our problem of finding $v$ of the specified type satisfying (3) is therefore just
an ordinary differential equation with constant coefficients. Pick a solution
$u_\omega$ and assume that this choice can be made smoothly in $\omega$. Then the function

\[ u = c\, \Delta^{(n-1)/2} \int_{S^{n-1}} u_\omega\, d\omega \tag{4} \]

is a solution of the equation (2). In fact, since differential operators with
constant coefficients commute we have (at least for $n$ odd)

\[ Du = c\, \Delta^{(n-1)/2} \int_{S^{n-1}} Du_\omega\, d\omega = c\, \Delta^{(n-1)/2} \int_{S^{n-1}} f_\omega\, d\omega = f \]




This proof actually works also for $n$ even. The weakness of the method lies
in the assumption that $u_\omega$ can be chosen so as to vary smoothly in $\omega$. In
fact the example $D = \partial^2/\partial x_1 \partial x_2$, $\omega = (1, 0)$ shows that $u_\omega$ may not exist for
all $\omega$.
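
The reduction to an ordinary differential equation, and the way it degenerates in the example just given, can be made concrete. The following is a minimal sketch under assumptions of our own: $n = 2$, the operator is stored as a dictionary from multi-indices to coefficients, and `reduce_to_ode` is a hypothetical helper, not something from the text. Applying $P(\partial)$ to a plane wave $\varphi((x, \omega))$ replaces each $\partial_{x_j}$ by $\omega_j\, d/dp$, so the term with multi-index $(a_1, a_2)$ contributes $\omega_1^{a_1} \omega_2^{a_2}\, \varphi^{(a_1+a_2)}$.

```python
import math

def reduce_to_ode(symbol, omega):
    """Coefficients c_k of the ODE sum_k c_k * phi^(k)(p) obtained by applying
    the constant-coefficient operator P(d/dx) to a plane wave phi((x, omega)).

    symbol maps multi-indices (a1, a2) to the coefficient of
    d^(a1+a2) / dx1^a1 dx2^a2; omega is the normal of the plane wave.
    """
    coeffs = {}
    for (a1, a2), c in symbol.items():
        order = a1 + a2
        coeffs[order] = coeffs.get(order, 0.0) + c * omega[0] ** a1 * omega[1] ** a2
    return coeffs

mixed = {(1, 1): 1.0}                      # D = d^2 / dx1 dx2
laplacian = {(2, 0): 1.0, (0, 2): 1.0}     # D = Laplacian on R^2

# Normal (1, 0): every coefficient vanishes, so D kills all such plane waves.
print(reduce_to_ode(mixed, (1.0, 0.0)))    # {2: 0.0}

theta = math.pi / 4
omega = (math.cos(theta), math.sin(theta))
print(reduce_to_ode(mixed, omega))         # nonzero coefficient: solvable ODE
print(reduce_to_ode(laplacian, omega))     # coefficient |omega|^2, close to 1
```

For $\omega = (1, 0)$ the reduced equation reads $0 = f_\omega$, which has no solution unless $f_\omega$ vanishes; this is the obstruction the example points to.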

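For a symmetric space $X = G/K$ the inversion formula for the Fourier
transform (Theorem 4.5) does give a decomposition of an arbitrary function
$f \in C_c^\infty(X)$ into plane waves. In fact let as before

\[ \tilde f(\lambda, b) = \int_X f(x)\, e^{(-i\lambda+\rho)(A(x,b))}\, dx, \qquad \lambda \in \mathfrak{a}^*,\ b \in B \]

and put

\[ f_b(x) = \int_{\mathfrak{a}^*} \tilde f(\lambda, b)\, e^{(i\lambda+\rho)(A(x,b))}\, |c(\lambda)|^{-2}\, d\lambda \tag{5} \]

Then $f_b(x)$ is a plane wave with normal $b$ so the formula

\[ f(x) = \int_B f_b(x)\, db \tag{6} \]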


does indeed give a decomposition of $f$ into plane waves. We shall now apply
this formula to the problem of solving a differential equation

\[ Du = f \tag{7} \]

where $D$ is a given differential operator in $\mathbf{D}(G/K)$ and $f \in C_c^\infty(X)$ is a given
function. First we need a simple lemma concerning the action of invariant
differential operators on plane waves (cf. [27], p. 247, or [45]).


Lemma 6.2. Let $D \in \mathbf{D}(G/K)$. Then there exists a unique differential
operator $\delta(D)$ on the submanifold $A \cdot o \subset X$ such that if bar denotes restriction to
$A \cdot o$,

\[ \overline{DF} = \delta(D) \bar F \]

for every $F \in C^\infty(X)$ which is $N$-invariant (that is, a plane wave with normal
a*). This differential operator $\delta(D)$ is invariant under $A$.


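PROOF. Since the mapping $(n, a \cdot o) \to na \cdot o$ is a diffeomorphism of
$N \times (A \cdot o)$ onto $X$ the existence and uniqueness of $\delta(D)$ is obvious. Hence
we just have to prove its invariance under $A$. Let $a \in A$ and, as before,
if $F \in C^\infty(X)$ let $F^{\tau(a)}$ denote the function $x \to F(a^{-1} \cdot x)$ on $X$. If $F$ is invariant
under $N$ then the function $F^{\tau(a)}$ is too; in fact,

\[ F^{\tau(a)}(n \cdot x) = F(a^{-1} n \cdot x) = F(n_1 a^{-1} \cdot x) \]

for some $n_1 \in N$. Thus $F^{\tau(a)}(n \cdot x) = F^{\tau(a)}(x)$, and of course $\overline{F^{\tau(a)}} = (\bar F)^{\tau(a)}$.
Thus,

\[ \delta(D)^{\tau(a)} \bar F = \big( \delta(D)(\bar F^{\tau(a^{-1})}) \big)^{\tau(a)} = \big( \delta(D)\, \overline{F^{\tau(a^{-1})}} \big)^{\tau(a)} \]
\[ = \big( \overline{D F^{\tau(a^{-1})}} \big)^{\tau(a)} = \big( \overline{(DF)^{\tau(a^{-1})}} \big)^{\tau(a)} = \overline{DF} = \delta(D) \bar F \]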


This proves the lemma because each function in $C^\infty(A \cdot o)$ can be extended to
an $N$-invariant function in $C^\infty(X)$.

In order to solve the differential equation (7) we begin by considering 
the differential equation 


\[ Dv = f_b \tag{8} \]

for an arbitrary $b \in B$. We look for a solution $v = v^b$ which, like the function
$f_b$ [cf. (5)], is a plane wave with normal $b$. For example, consider the case
b = a*. Then the function $f_b$ is invariant under $N$ and so is the required
function $v^b$. According to Lemma 6.2, the differential equation $Dv^b = f_b$ on
$X$ amounts to the differential equation

\[ \delta(D) \bar v^b = \bar f_b \tag{9} \]

which is by the $A$-invariance of $\delta(D)$ a differential equation with constant
coefficients on the Euclidean space $A \cdot o$. But by a result of Ehrenpreis [16]
and Malgrange [52], a differential operator on $\mathbf{R}^n$ with constant coefficients
maps the space $C^\infty(\mathbf{R}^n)$ onto itself. Hence a solution $v = v^b$ exists. Now we
assume that $v^b$ can be chosen so that it depends smoothly on $b$. Then we put

\[ u(x) = \int_B v^b(x)\, db, \qquad x \in X \]

and have

\[ Du = \int_B Dv^b\, db = \int_B f_b\, db = f \]


This is not an existence proof for the differential equation (7) because of the
smoothness assumption about $v^b$ (see, however, Trèves [59], p. 131).
Nevertheless, we have the following general theorem (Helgason [33], pp. 577-578).


Theorem 6.3. Let $D \neq 0$ be an arbitrary $G$-invariant differential operator on
the symmetric space $G/K$. For each $f \in C_c^\infty(G/K)$ the differential equation
$Du = f$ has a solution $u \in C^\infty(G/K)$.

It suffices to find a distribution $T$ on $X$ satisfying the differential
equation $DT = \delta$, where $\delta$ is the delta-distribution at the origin $o$. In fact, the
desired solution is then $u = f * T$, where $*$ is the operation on distributions
on $X$ which is induced by the convolution product of distributions on $G$.
Since $D$ and $\delta$ are $K$-invariant we look for a $K$-invariant $T$. For this we use
the transform $f \to F_f$ discussed in §3. As proved in Harish-Chandra [28],
p. 46, this transform is one-to-one on the space $I(X)$ of $K$-invariant,
square-integrable functions on $X$ which are rapidly decreasing on $X$ in a certain
technical sense, and the transform maps $I(X)$ into the space $I(A)$ of Weyl
group invariant functions on $A$ which are rapidly decreasing on $A$ (considered
as a Euclidean space). On the other hand, it is proved in Helgason [33] that
the range of the mapping $f \to F_f$ ($f \in I(X)$) is precisely $I(A)$ and furthermore,
$F_{Df} = \gamma(D) F_f$, where $\gamma(D)$ is a certain constant-coefficient differential operator
on $A$. The isomorphism $f \to F_f$ of $I(X)$ onto $I(A)$ has a transpose, mapping
the dual $I'(A)$ of $I(A)$ onto the dual $I'(X)$ of $I(X)$. Under this isomorphism
the differential equation $DT = \delta$ on $X$ is transformed into a differential
equation for tempered distributions on $A$, and this last differential equation has
constant coefficients since $\gamma(D)$ does. But by a theorem of Hörmander [40]
and Łojasiewicz [49] any differential operator on $\mathbf{R}^n$ with constant
coefficients maps the space of tempered distributions on $\mathbf{R}^n$ onto itself. This
leads to the desired distribution $T$ on $X$, proving the theorem.


4-7 The Wave Equation on Symmetric Spaces 


We shall now discuss a different method for solving differential
equations on the symmetric space $X$. It uses the Radon transform on $X$ which we
now define. Let $\Xi$ denote the set of all horocycles in $X$. For $f \in C_c^\infty(X)$
we define the function $\hat f$ on $\Xi$ by

\[ \hat f(\xi) = \int_\xi f(x)\, d\sigma(x), \qquad \xi \in \Xi \tag{1} \]

where $d\sigma$ is the volume element on $\xi$. (The Riemannian structure on $X$
induces in an obvious way a Riemannian structure on the submanifold $\xi$.)
The function $\hat f$ is called the Radon transform of $f$.

If $x \in X$ the (compact) subgroup $K_x$ of $G$ which keeps $x$ fixed permutes
the horocycles through $x$ transitively. For $x = o$ this is obvious from Lemma
4.2 and in general it follows by the homogeneity of $X$. The set of horocycles
passing through $x$ has a unique normalized measure, say $\mu$, invariant under
$K_x$.

If $\phi$ is a function on $\Xi$ the function $\check\phi$ on $X$ is defined by

\[ \check\phi(x) = \int_{\xi \ni x} \phi(\xi)\, d\mu(\xi) \tag{2} \]


Theorem 7.1. Suppose all Cartan subgroups of $G$ are conjugate. Then for
a certain fixed differential operator $\Box \in \mathbf{D}(G/K)$

\[ f = \Box\big( (\hat f)^\vee \big), \qquad f \in C_c^\infty(G/K) \tag{3} \]

This formula is analogous to the inversion formula of Radon-John
(Theorem 6.1) for the case of an odd-dimensional Euclidean space. The
even-dimensional Euclidean case corresponds here to the existence of
nonconjugate Cartan subgroups and in this case (3) still holds in a slightly
modified form (cf. [35], p. 759). We emphasize that the differential operator $\Box$
can be written down quite explicitly.

By means of (3) one can write down a solution of the wave equation
on $X$,

\[ \frac{\partial^2 u}{\partial t^2} = \Delta u \tag{4} \]

with initial data

\[ u(x, 0) = 0, \qquad \Big\{ \frac{\partial u}{\partial t} \Big\}_{t=0} = f(x) \tag{5} \]

Here $\Delta$ denotes the Laplace-Beltrami operator on $X$ and $f$ is an arbitrary
given function in $C_c^\infty(X)$.

In the notation of §1, let $\tilde\Box \in \mathbf{D}_K(G)$ be an operator satisfying
$(\Box f)^\sim = \tilde\Box \tilde f$ for all $f \in C^\infty(G/K)$. Let $|\rho|$ denote the norm of the linear form $\rho$, and
let $dn$ be a Haar measure on $N$ which corresponds to the volume element $d\sigma$
on $\xi_0 = N \cdot o$ under the diffeomorphism $n \to n \cdot o$. Let $\Delta_A$ denote the
Laplacian on the Euclidean space $A$.


Theorem 7.2. The solution to the wave equation (4) with initial data (5) is
given by

\[ u(g \cdot o, t) = \tilde\Box_g \Big( \int_K V_{k,g}(e, t)\, dk \Big) \tag{6} \]

where $V_{k,g}$ is the solution to the equation for damped waves on $A \times \mathbf{R}$,

\[ (\Delta_A - |\rho|^2)\, V_{k,g} = \frac{\partial^2}{\partial t^2} V_{k,g} \tag{7} \]

\[ V_{k,g}(a, 0) = 0, \qquad \Big\{ \frac{\partial}{\partial t} V_{k,g}(a, t) \Big\}_{t=0} = e^{\rho(\log a)} F_{k,g}(a) \]

where

\[ F_{k,g}(a) = \int_N f(gkan \cdot o)\, dn \]

Although the verification of this theorem is not long (cf. [32], p. 688)
we omit it here because it requires some further preparation. The function
$V_{k,g}$ is given as a convolution of a certain Bessel function with $F_{k,g}$ so the
solution (6) is explicitly given in terms of the initial data $f(x)$.




Huygens’ Principle 


Let $M$ be an analytic pseudo-Riemannian manifold with Lorentzian
signature, in short, a Lorentzian manifold. Since our considerations will
be local we assume that $M$ is convex, that is, any two points in $M$ can be
joined by a unique geodesic. The geodesics of zero length through a point
$p \in M$ generate the light cone $C_p$ in $M$ with vertex $p$. A submanifold $S \subset M$
is called spacelike if each tangent vector to $S$ is spacelike. Let $\Delta$ denote the
(hyperbolic) Laplace-Beltrami operator on $M$, and suppose now that a
Cauchy problem is posed for the wave equation $\Delta u = 0$ with initial data on
a spacelike hypersurface $S \subset M$. Hadamard proved that the value $u(p)$ of
the solution at a point $p \in M$ only depends on the initial data on the piece
$S^* \subset S$ which lies inside the light cone $C_p$. Huygens' principle (in the strong
sense) is said to hold for $\Delta u = 0$ if the value $u(p)$ only depends on the initial
data in an arbitrarily small neighborhood of the edge $s$ of $S^*$, $s = C_p \cap S$. It
is known that this is a property of the space $M$ and does not depend on the
particular choice of $S \subset M$. The wave equation

\[ \frac{\partial^2 u}{\partial x_1^2} = \frac{\partial^2 u}{\partial x_2^2} + \cdots + \frac{\partial^2 u}{\partial x_n^2} \]

for an odd-dimensional $\mathbf{R}^{n-1}$ satisfies Huygens' principle. A conjecture,
attributed to Hadamard, was that these were essentially the only second-order
hyperbolic equations satisfying Huygens' principle. A counter-example of
the form $\Delta u + cu = 0$ ($n = 6$) was given by Stellmacher [58] in 1953, and in
1965, P. Günther [23] gave a whole series of counter-examples for the pure
equation $\Delta u = 0$ ($n = 4$). These are based on Hadamard's criterion that
Huygens' principle holds if and only if $n$ is even and $\geq 4$ and the logarithmic
part of the fundamental solution (in Hadamard's sense) vanishes.

If $M$ is symmetric the evidence available seems to indicate that "Hadamard's
conjecture" might hold for the pure equation $\Delta u = 0$. For $M$ of
constant curvature (a "de Sitter space" or an "anti de Sitter space") this is
indeed so (cf. [29], p. 296; see also [13]). The answer is also affirmative if $M$
has the form $M = M_0 \times \mathbf{R}$, where $M_0$ has dimension 3 and constant
curvature (Hölder [39]). Finally the answer is affirmative if $M = X \times \mathbf{R}$, where $X$
is a symmetric space whose group of isometries is a complex semisimple Lie
group (Helgason [33], p. 582).


REFERENCES 


1. V. Bargmann, Irreducible Representations of the Lorentz Group, Ann. of 
Math. 48 (1947), 568-640. 

2. F. A. Berezin, An Analog of Liouville’s Theorem for Symmetric Spaces of 
Negative Curvature, Dokl. Akad. Nauk SSSR 125 (1959), 1187-1189. 




3. M. Berger, Les espaces symétriques non compacts, Ann. Sci. Ecole Norm. 
Sup. 74 (1957), 85-177. 


4. Bhanu-Murthy, Plancherel’s Measure for the Factor Space SL(n, R)/SO(n),
Dokl. Akad. Nauk SSSR 133 (1960), 503-506.

5. Bhanu-Murthy, The Asymptotic Behavior of Zonal Spherical Functions on
the Siegel Upper Half-Plane, Dokl. Akad. Nauk SSSR 135 (1960), 1027-1030.

6. N. Bourbaki, Espaces vectoriels topologiques, I-II, Hermann, Paris, 1953.

7. F. Bruhat, Sur les représentations induites de groupes de Lie, Bull. Soc.
Math. France 84 (1956), 97-205.

8. E. Cartan, Sur une classe remarquable d’espaces de Riemann, Bull. Soc.
Math. France 54 (1926), 214-264, 55 (1927), 114-134.

9. E. Cartan, Sur certaines formes riemanniennes remarquables des géométries
à groupe fondamental simple, Ann. Sci. Ecole Norm. Sup. 44 (1927), 345-467.


10. E. Cartan, Groupes simples clos et ouverts et géométrie riemannienne, 
J. Math. Pures Appl. 8 (1929), 1-33. 


11. E. Cartan, La théorie des groupes finis et continus et l’Analysis situs, Mém. 
Sci. Math. fasc. XLII (1930). 


12. C. Chevalley, Theory of Lie Groups, Vol. I. Princeton University Press, 
1946. 


13. Y. Choquet-Bruhat, Sur la théorie des propagateurs, Ann. Mat. Pura Appl. 
4 (64) (1964), 191-228. 


14. R. Courant and A. Lax, Remarks on Cauchy’s Problem for Hyperbolic
Partial Differential Equations with Constant Coefficients in Several Independent
Variables, Comm. Pure Appl. Math. 8 (1955), 497-502.

15. J. Dixmier, Les algèbres d’opérateurs dans l’espace hilbertien, Gauthier-
Villars, Paris, 1957.

16. L. Ehrenpreis, Solutions of some Problems of Division, Am. J. Math. 76
(1954), 883-903.


17. A. Erdélyi et al. Higher Transcendental Functions (Bateman Manuscript 
Project) McGraw-Hill, New York 1953. 

18. V. A. Fok, On the Expansion of Arbitrary Functions in Integrals according 
to Legendre functions with arbitrary indices, Dokl. Akad. Nauk SSSR 49 (1943), 
279-282. 


19. H. Furstenberg, A Poisson Formula for Semisimple Lie Groups, Ann. of
Math. 77 (1963), 335-386.


20. I. M. Gelfand and S. J. Shapiro, Homogeneous Functions and Their 
Applications, Am. Math. Soc. Transl. 8 (1958), 21-85. 


21. S. G. Gindikin and F. I. Karpelevič, Plancherel Measure of Riemann
Symmetric Spaces of Nonpositive Curvature, Soviet Math. 3 (1962), 962-965.

22. R. Godement, Une généralisation du théorème de la moyenne pour les
fonctions harmoniques, C. R. Acad. Sci. Paris 234 (1952), 2137-2139.

22a. R. Godement, Séminaire Bourbaki (1957).


23. P. Günther, Ein Beispiel einer nicht trivialen huygensschen
Differentialgleichung mit vier unabhängigen Variablen, Arch. Rat. Mech. and Anal. 18 (1965),
103-106.

24. P. R. Halmos, Measure Theory, Van Nostrand, Princeton, New Jersey,
1950.

25. Harish-Chandra, Representations of Semisimple Lie Groups I, Trans. Am.
Math. Soc. 75 (1953), 185-243.

26. Harish-Chandra, On a Lemma of F. Bruhat, J. Math. Pures Appl. 35, 
(1956), 203-210. 

27. Harish-Chandra, Spherical Functions on a Semisimple Lie Group I, II,
Am. J. Math. 80 (1958), 241-310, 553-613.

28. Harish-Chandra, Discrete Series for Semisimple Lie Groups II, Acta Math.
116 (1966), 1-111.

29. S. Helgason, Differential Operators on Homogeneous Spaces, Acta Math. 
102 (1959), 239-299. 

30. S. Helgason, Some Remarks on the Exponential Mapping for an Affine 
Connection, Math. Scand. 9 (1961), 129-146. 

31. S. Helgason, Differential Geometry and Symmetric Spaces, Academic Press,
New York, 1962.

32. S. Helgason, Duality and Radon Transform for Symmetric Spaces, Am.
J. Math. 85 (1963), 667-692.

33. S. Helgason, Fundamental Solutions of Invariant Differential Operators 
on Symmetric Spaces, Am. J. Math. 86 (1964), 565-601. 

34. S. Helgason, The Radon Transform on Euclidean Spaces, Compact Two- 
Point Homogeneous Spaces and Grassmann Manifolds, Acta Math. 113 (1965), 
153-180. 

35. S. Helgason, Radon-Fourier Transforms on Symmetric Spaces and Rela- 
ted Group Representations, Bull. Am. Math. Soc. 71 (1965), 757-763. 

36. S. Helgason, A Duality in Integral Geometry on Symmetric Spaces, Proc.
U.S.-Japan Seminar in Differential Geometry, Kyoto, June 1965, Nippon
Hyoron-Sha, Tokyo.

37. S. Helgason, An Analog of the Paley-Wiener Theorem for the Fourier 
Transform on Certain Symmetric Spaces, Math. Ann. 165 (1966), 297-308. 

38. S. Helgason and A. Korányi, A Fatou-Type Theorem for Harmonic
Functions on Symmetric Spaces, Bull. Am. Math. Soc. (1968) (to appear).

39. E. Hölder, Poissonsche Wellenformel in nichteuklidischen Räumen, Ber.
Verh. Sächs. Akad. Wiss. Leipzig 99 (1938).

40. L. Hörmander, On the Division of Distributions by Polynomials, Ark. Mat.
3 (1958), 555-568.

41. G. Hunt, A Theorem of Elie Cartan, Proc. Am. Math. Soc. 7 (1956),
307-308.

42. S. Itô, Unitary Representations of Some Linear Groups II, Nagoya Math.
J. 5 (1953), 79-96.




43. N. Jacobson, Lie Algebras, Interscience, New York, 1962. 

44. F. John, Plane Waves and Spherical Means, Applied to Partial Differential 
Equations, Interscience, New York, 1955. 

45. F. I. Karpelevič, Orispherical Radial Parts of Laplace Operators on
Symmetric Spaces, Soviet Math. 3 (1962), 528-531.

46. F. I. Karpelevič, Geometry of Geodesics and Eigenfunctions of the
Laplace-Beltrami Operator on Symmetric Spaces, Trudy Moskov. Mat. Obšč. 14 (1965),
48-185.

47. S. Kobayashi and T. Nagano, On Filtered Lie Algebras and Geometric
Structures, J. Math. Mech. 13 (1964), 875-908.

48. G. Köthe, Die Randverteilungen analytischer Funktionen, Math. Zeitschr.
57 (1952), 13-33.

49. S. Łojasiewicz, Sur le problème de division, Studia Math. 18 (1959), 87-136.

50. L. Loomis, Abstract Harmonic Analysis, Van Nostrand, Princeton, New 
Jersey, 1953. 

51. G. W. Mackey, Induced Representations of Locally Compact Groups I, 
Ann. Math. 55 (1952), 101-139. 

52. B. Malgrange, Existence et approximations des solutions des équations aux 
dérivées partielles et des équations de convolution, Ann. Inst. Fourier Grenoble 6 
(1955-56), 271-354. 

53. C. C. Moore, Compactifications of Symmetric Spaces, Amer. J. Math. 86 
(1964), 201-218. 

54. G. D. Mostow, A New Proof of E. Cartan’s Theorem on the Topology of 
Semisimple Groups. Bull. Am. Math. Soc. 55 (1949), 969-980. 

55. S. B. Myers and N. Steenrod, The Group of Isometries of a Riemannian 
Manifold, Ann. of Math. 40 (1939), 400-416. 

56. G. Schiffmann, Frontières de Furstenberg et formules de Poisson sur un
groupe de Lie semi-simple, Séminaire Bourbaki 16 (1963-64).

57. L. Schwartz, Théorie des distributions I, II, Hermann, Paris, 1950, 1951.

58. K. Stellmacher, Ein Beispiel einer huygensschen Differentialgleichung,
Gött. Nachr. 1953 H.10.

59. F. Trèves, Equations aux dérivées partielles inhomogènes à coefficients
constants dépendant de paramètres, Ann. Inst. Fourier, Grenoble 13 (1963), 123-138.


Special Functions and 
Representations of Lie Groups 


LEON EHRENPREIS 


Most of the materials covered in this seminar will appear in 
Fourier Analysis in Several Complex Variables by L. Ehrenpreis 
(to be published). 




Commutativity of the Algebra of
Invariant Differential Operators
on a Symmetric Space


ANDRE LICHNEROWICZ 


1 Differential Operators on a Manifold with a Linear Connection
2 Homogeneous Spaces and Adjoint Operators
3 The Notion of a Symmetric Space
4 Symmetry with Respect to a Point
5 Adjoint Operators on a Symmetric Space
6 Transform by Symmetry of an Operator of $\mathcal{D}(G/H)$
7 Commutativity Theorem
Bibliography


If $V_n = G/H$ is a Lie homogeneous space, let $\mathcal{D}(G/H)$ be the algebra
of differential operators on scalars that are invariant under $G$. The aim of this
lecture is to show that if $G/H$ is a symmetric space (not necessarily
Riemannian) admitting an invariant volume element, the algebra
$\mathcal{D}(G/H)$ is commutative. This result was established by Selberg [3] in the
case of Riemannian symmetric spaces and by myself [2] under the more general
hypothesis made here.


1 DIFFERENTIAL OPERATORS ON A MANIFOLD
WITH A LINEAR CONNECTION


(a) Let $V_n$ be a differentiable manifold¹ endowed with a linear connection
W. The set of real-valued functions $f$ of class $C^\infty$ defined on $V_n$
constitutes an algebra $C^\infty(V_n)$ over the field of real numbers.


¹ $V_n$ is assumed paracompact, of dimension $n$, of class $C^\infty$. We set
$\partial_\alpha = \partial/\partial x^\alpha$ ($\alpha = 1, \ldots, n$).




Let $D$ be a differential operator of order $N$ on $V_n$. On an open
neighborhood $U$ referred to arbitrary local coordinates, one can write,
singling out in $D$ the differentiation polynomial of maximal order,

\[ [Df](x) = \frac{1}{N!}\, t_{(N)}^{\alpha_1 \cdots \alpha_N}\, \partial_{\alpha_1} \cdots \partial_{\alpha_N} f(x^\alpha) + Q_U(\partial_\alpha) f(x^\alpha) \qquad (x \in U) \tag{1-1} \]

In (1-1), $f$ denotes the function $f \in C^\infty(V_n)$ written in the local coordinate
system considered, the $t_{(N)}$ are assumed symmetric with respect to
their indices, and the polynomial $Q_U$ is of degree less than $N$. By working
in the intersection of the domains of two local coordinate systems, one
sees that the $t_{(N)}^{\alpha_1 \cdots \alpha_N}$ constitute a system of local components of a symmetric
contravariant tensor of order $N$ on $V_n$, which we denote by $t_{(N)}$.

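Let $\nabla$ be the operator of covariant differentiation in the connection W.
The differential operator defined on $V_n$ by:

\[ f \in C^\infty(V_n) \to [Df](x) - \frac{1}{N!}\, t_{(N)}^{\alpha_1 \cdots \alpha_N}\, \nabla_{\alpha_1} \cdots \nabla_{\alpha_N} f(x) \qquad (x \in U) \]

is of order less than $N$. One deduces by induction that there exist on $V_n$
symmetric contravariant tensors $t_{(q)}$ ($q = 0, 1, \ldots, N$) such that:

\[ [Df](x) = \sum_{q=0}^{N} \frac{1}{q!}\, t_{(q)}^{\alpha_1 \cdots \alpha_q}\, \nabla_{\alpha_1} \cdots \nabla_{\alpha_q} f(x) \tag{1-2} \]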


Denote by the symbol $(\,,)$ the inner product, divided by $q!$, of a
contravariant $q$-tensor with a covariant $q$-tensor. There thus exist
symmetric contravariant $q$-tensors $t_{(q)}$ such that $D$ is defined by:

\[ D : f \in C^\infty(V_n) \to Df = \sum_{q=0}^{N} (t_{(q)}, \nabla^q f) \in C^\infty(V_n) \tag{1-3} \]


(b) Let $\rho$ be a transformation (or automorphism) of the differentiable
manifold $V_n$. We denote by $\rho'$ the tangent linear map; $\rho'$ acts
on contravariant tensors. To $\rho$ there also corresponds an action, denoted $\rho^*$,
on covariant tensors; in particular $\rho^* f$ ($f \in C^\infty(V_n)$) is nothing but the
composite function $f \circ \rho$. We say that $\rho$ is an affine transformation of
$V_n$ if it leaves the connection W invariant.

An operator $D$ on $V_n$ is invariant under the transformation $\rho$ if:

\[ \rho^* D = D \rho^* \tag{1-4} \]

Let $\rho$ be an affine transformation of $V_n$. By (1-3):

\[ \rho^* Df = \sum_{q=0}^{N} (\rho' t_{(q)}, \rho^* \nabla^q f) \]




The connection being invariant under $\rho$, $\rho^*$ commutes with covariant
differentiation, and one has:

\[ \rho^* Df = \sum_{q=0}^{N} (\rho' t_{(q)}, \nabla^q \rho^* f) \tag{1-5} \]

On the other hand,

\[ D \rho^* f = \sum_{q=0}^{N} (t_{(q)}, \nabla^q \rho^* f) \tag{1-6} \]

By considering functions $f$ such that, at $x$, $f(x) = 0$ and $[\nabla^q f](x) = 0$ for
$q \le N - 1$, then $N - 2$, ..., one sees from (1-5), (1-6) that for $D$ to be
invariant under $\rho$ it is necessary and sufficient that $\rho' t_{(q)} = t_{(q)}$ ($q = 0, 1, \ldots, N$). Thus:


Theorem. Given a differentiable manifold $V_n$ endowed with a linear
connection, for a differential operator $D$ on $V_n$

\[ D : f \to Df = \sum_{q=0}^{N} (t_{(q)}, \nabla^q f) \]

to be invariant under an affine transformation $\rho$, it is necessary and sufficient that
the symmetric contravariant tensors $t_{(q)}$ associated with $D$ be themselves invariant
under $\rho$.


2 HOMOGENEOUS SPACES AND ADJOINT OPERATORS


(a) Let $G$ be a connected Lie group and $H$ a closed subgroup of $G$.
We denote by $G/H$ the set of left cosets $gH$ of $G$ and by $p$ the
canonical projection

\[ p : G \to G/H \]

which to every $g \in G$ associates the left coset $pg = gH$. The group $G$ acts
transitively on $G/H$: if $\gamma \in G$ and $x = gH \in G/H$, the image of $x$ under $\gamma$, say
$K_\gamma(x)$, is given by left translation $L_\gamma$ by $\gamma$, that is:

\[ K_\gamma(x) = \gamma g H \]

or

\[ K_\gamma(pg) = p(L_\gamma g) \]

Thus $K_\gamma$ is defined by:

\[ K_\gamma \circ p = p \circ L_\gamma \qquad (\gamma \in G) \tag{2-1} \]

If $x_0 = pe = H$, one can write, by abuse of notation, $x = gH = K_g(x_0)$ in
the form $x = g x_0$.



One shows that $G/H$ admits a unique structure of real analytic manifold
such that $G$ acts analytically on $G/H$ by (2-1), that is, such
that the map of $G \times G/H$ onto $G/H$ defined by $\{\gamma, gH\} \to K_\gamma(gH)$ is
analytic, the projection $p$ then being an analytic map. We
call Lie homogeneous space the real analytic manifold $V_n = G/H$ on
which $G$ acts transitively and analytically; $H$ is the isotropy group
at $x_0$.

A differential operator $D$ on $V_n = G/H$ is invariant under $G$ if, for
every $\gamma \in G$, it is invariant under the transformation $K_\gamma$, that is, if:

\[ K_\gamma^* D = D K_\gamma^* \tag{2-2} \]

We denote by $\mathcal{D}(G/H)$ the algebra of differential operators on $G/H$
invariant under $G$.

(b) Consider a Lie homogeneous space $V_n = G/H$ subject only to
admitting a volume element $\eta$ invariant under $G$. If the intersection of the
supports of $f_{(1)}, f_{(2)} \in C^\infty(V_n)$ is compact, we set:

\[ \langle f_{(1)}, f_{(2)} \rangle = \int_{V_n} f_{(1)} f_{(2)}\, \eta \tag{2-3} \]

(2-3) is defined for every $f_{(1)} \in C^\infty(V_n)$ if $f_{(2)} \in C_c^\infty(V_n)$, the algebra of functions
of $C^\infty(V_n)$ with compact support.

We call adjoint operator of a differential operator $D$ of $V_n$
a differential operator $\tilde D$ such that if $f_{(1)} \in C^\infty(V_n)$ and $f_{(2)} \in C_c^\infty(V_n)$ one has

\[ \langle D f_{(1)}, f_{(2)} \rangle = \langle f_{(1)}, \tilde D f_{(2)} \rangle \tag{2-4} \]

Such an operator exists and is uniquely determined by condition (2-4).
Let $D$, $D'$ be two differential operators of $V_n$ admitting the adjoints $\tilde D$ and $\tilde D'$.
By (2-4):

\[ \langle D D' f_{(1)}, f_{(2)} \rangle = \langle D' f_{(1)}, \tilde D f_{(2)} \rangle = \langle f_{(1)}, \tilde D' \tilde D f_{(2)} \rangle \]

It follows that $D D'$ admits the adjoint operator

\[ (D D')^\sim = \tilde D' \tilde D \tag{2-5} \]


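(c) Suppose $D \in \mathcal{D}(G/H)$. Then the same is true of $\tilde D$. From the
invariance of $\eta$ it follows, indeed, that if $\gamma \in G$

\[ \langle K_{\gamma^{-1}}^* f_{(1)}, f_{(2)} \rangle = \langle f_{(1)}, K_\gamma^* f_{(2)} \rangle \tag{2-6} \]

Using (2-6), let us evaluate:

\[ \langle K_{\gamma^{-1}}^* D f_{(1)}, f_{(2)} \rangle = \langle D f_{(1)}, K_\gamma^* f_{(2)} \rangle = \langle f_{(1)}, \tilde D K_\gamma^* f_{(2)} \rangle \]

On the other hand:

\[ \langle D K_{\gamma^{-1}}^* f_{(1)}, f_{(2)} \rangle = \langle K_{\gamma^{-1}}^* f_{(1)}, \tilde D f_{(2)} \rangle = \langle f_{(1)}, K_\gamma^* \tilde D f_{(2)} \rangle \]

$D$ being invariant under $G$, the left-hand sides of the two preceding relations
are equal. It follows that, for every $f_{(1)} \in C^\infty(V_n)$:

\[ \langle (\tilde D K_\gamma^* - K_\gamma^* \tilde D) f_{(2)}, f_{(1)} \rangle = 0 \]

Thus for every $f_{(2)} \in C_c^\infty(V_n)$:

\[ (\tilde D K_\gamma^* - K_\gamma^* \tilde D) f_{(2)} = 0 \]

which implies $\tilde D K_\gamma^* = K_\gamma^* \tilde D$.

Theorem. If $G/H$ is a homogeneous space admitting an invariant volume
element, the adjoint operator $\tilde D$ of an invariant differential operator $D$ of
$G/H$ is itself invariant.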


3 THE NOTION OF A SYMMETRIC SPACE


(a) Suppose that there exists, on a given connected Lie group $G$, an
involutive automorphism $S$ of $G$. $S$ being an automorphism, for all
$g, g' \in G$

\[ S(gg') = S(g) S(g') \tag{3-1} \]

$S$ being involutive

\[ S^2 = \mathrm{Id} \tag{3-2} \]

Let $H^S$ be the subgroup of elements of $G$ fixed by $S$. It is a closed subgroup
of $G$, whose identity component we denote by $H_0^S$.


Definition. A homogeneous space $V_n = G/H$ admits a symmetric space
structure if there exists an involutive automorphism $S$ of $G$ such that:

\[ H_0^S \subset H \subset H^S \]

If $H_0$ is the identity component of $H$, one has $H_0 = H_0^S$. We
may assume, without loss of generality, that $G$ is effective on $G/H$.
Riemannian symmetric spaces are a particular case of symmetric spaces
in the preceding sense.

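(b) The involutive automorphism $S$ of $G$ induces an involutive
automorphism $S$ of the Lie algebra $\underline{G}$ of $G$. From the relation

\[ S(hgh^{-1}) = h S(g) h^{-1} \qquad (g \in G,\ h \in H) \tag{3-3} \]

one deduces:

\[ S\, \mathrm{ad}(h) = \mathrm{ad}(h)\, S \qquad (h \in H) \]

The eigenvalues of the involutive automorphism $S$ of $\underline{G}$ ($S^2 = \mathrm{Id}$) are
necessarily equal to $\pm 1$. Consider the subspace of $\underline{G}$ defined by
the eigenvectors of $S$ corresponding to the eigenvalue 1. To every element
of this subspace there corresponds a one-parameter subgroup of elements of
$G$ fixed by $S$. Conversely, to every one-parameter subgroup of $H_0$
there corresponds an element of $\underline{G}$ invariant under $S$. Thus the subspace considered
is nothing but the Lie algebra $\underline{H}$ of $H_0$ or $H$. Let $\underline{M}$ be the subspace of $\underline{G}$
defined by the eigenvectors of $S$ corresponding to the eigenvalue $-1$;
$\underline{M}$ is complementary to $\underline{H}$ in $\underline{G}$,

\[ \underline{G} = \underline{H} \oplus \underline{M} \tag{3-4} \]

and by (3-3)

\[ \mathrm{ad}(H)\, \underline{M} \subset \underline{M} \tag{3-5} \]

If $\lambda, \mu \in \underline{M}$, one has:

\[ S[\lambda, \mu] = [S\lambda, S\mu] = [\lambda, \mu] \]

and $[\lambda, \mu] \in \underline{H}$. The subspace $\underline{M}$ therefore satisfies:

\[ [\underline{M}, \underline{M}] \subset \underline{H} \tag{3-6} \]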
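
A concrete instance may help fix ideas. The sketch below is an illustration under assumptions of our own, not taken from the text: it takes the Lie algebra of all real $2 \times 2$ matrices with the involution induced by $g \to (g^T)^{-1}$, namely $X \to -X^T$. The eigenspace for $+1$ then consists of the skew-symmetric matrices and the eigenspace for $-1$ of the symmetric ones, and the code checks on one pair of symmetric matrices that their bracket is fixed by the involution, as (3-6) requires. The function names are hypothetical.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def bracket(a, b):
    """Lie bracket [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def S(x):
    """Involution X -> -X^T induced on the Lie algebra by g -> (g^T)^{-1}."""
    return [[-x[j][i] for j in range(2)] for i in range(2)]

lam = [[1, 2], [2, -3]]    # symmetric, so S(lam) = -lam (eigenvalue -1)
mu = [[0, 5], [5, 4]]      # symmetric as well

c = bracket(lam, mu)
print(S(lam) == [[-v for v in row] for row in lam])   # True
print(c)                                              # [[0, 28], [-28, 0]]
print(S(c) == c)                                      # True: [lam, mu] has eigenvalue +1
```

The bracket of two symmetric matrices is skew-symmetric, which is exactly the statement that it lies in the $+1$ eigenspace in this model.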
The decomposition (3-4) of $\underline{G}$, which satisfies (3-5), defines on $G/H$ a structure
called reductive homogeneous space. One shows that, by virtue of its invariance under
$H$, the subspace $\underline{M}$ allows one to define on the symmetric space $V_n = G/H$ a
canonical linear connection W invariant under $G$ and that, by (3-6), this
connection is torsion-free. One shows moreover that every tensor invariant under
$G$ has vanishing covariant derivative in this connection.


4 SYMMETRY WITH RESPECT TO A POINT


Starting from $S$, one can define transformations of the symmetric space
$V_n = G/H$; each of these transformations leaves a point of $V_n$ fixed.
(a) If $x_0 = pe$, let us associate to every element $g$ of $G$ the point

\[ y = S(g) x_0 \in V_n \tag{4-1} \]

To the element $gh$, where $h \in H$, is associated the point:

\[ S(gh) x_0 = S(g) S(h) x_0 = S(g) h x_0 = S(g) x_0 = y \]

Thus the point $y$ defined by (4-1) depends only on the left coset $gH$ of $g$,
that is, on the point $x = g x_0$ of $V_n$. We denote by $S_{x_0}$ the
transformation $x \to y$ of $V_n$, a transformation which leaves the point $x_0$ fixed and which satisfies,
by its definition:

\[ S_{x_0} \circ p = p \circ S \tag{4-2} \]




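One clearly has:

\[ S_{x_0}^2 \circ p = p \]

and $S_{x_0}$ is an involutive transformation of $V_n$ ($S_{x_0}^2 = \mathrm{Id}$); $S_{x_0}$ is called the symmetry
with respect to $x_0$.


For $\gamma \in G$, let us compose $S_{x_0}$ with $K_\gamma$. For $x = g x_0$, one has:

\[ K_\gamma[S_{x_0}(x)] = K_\gamma[K_{S(g)}(x_0)] = K_{\gamma S(g)}(x_0) = K_{S(S(\gamma) g)}(x_0) = S_{x_0}[K_{S(\gamma)}(x)] \]

Thus, for every $\gamma \in G$,

\[ K_\gamma \circ S_{x_0} = S_{x_0} \circ K_{S(\gamma)} \tag{4-3} \]

In particular, for $h \in H$

\[ K_h \circ S_{x_0} = S_{x_0} \circ K_h \tag{4-4} \]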


(b) Given an element $g_1$ of $G$, consider the transformation
of $V_n$ defined by:

\[ K_{g_1} \circ S_{x_0} \circ K_{g_1^{-1}} \]

If for $g_1$ one substitutes the element $g_1 h$ ($h \in H$), one obtains:

\[ K_{g_1} \circ K_h \circ S_{x_0} \circ K_{h^{-1}} \circ K_{g_1^{-1}} = K_{g_1} \circ S_{x_0} \circ K_{g_1^{-1}} \]

by (4-4). Thus to every point $x_1 = g_1 x_0 \in V_n$ one can associate the
transformation $S_{x_1}$, depending only on $x_1$, defined by

\[ S_{x_1} = K_{g_1} \circ S_{x_0} \circ K_{g_1^{-1}} = S_{x_0} \circ K_{S(g_1) g_1^{-1}} \tag{4-5} \]

By its definition, $S_{x_1}$ leaves the point $x_1$ fixed and is involutive; it is called
the symmetry with respect to $x_1$.
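
Relations (4-3) and (4-5) can be tested on a familiar model. The sketch below rests on assumptions that are ours and not the text's: $G$ is taken to be the group of real $2 \times 2$ matrices with positive determinant, $S(g) = (g^T)^{-1}$, $H$ the rotation group, and the coset $gH$ is identified with the positive definite matrix $P = g g^T$. Under this identification $K_\gamma(P) = \gamma P \gamma^T$ and the symmetry about $x_0$ (the identity matrix) is $P \to P^{-1}$. Exact rational arithmetic is used so that equalities can be compared directly; all function names are hypothetical.

```python
from fractions import Fraction as Fr

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def S(g):
    """Involutive automorphism g -> (g^T)^{-1} of the group."""
    return inverse(transpose(g))

def K(gamma, P):
    """Action of gamma on the point P = g g^T of the model space."""
    return mul(mul(gamma, P), transpose(gamma))

def sym0(P):
    """Symmetry with respect to x_0 (the identity matrix): P -> P^{-1}."""
    return inverse(P)

def sym(g1, P):
    """Symmetry with respect to x_1 = g1 x_0, built as in (4-5)."""
    return K(g1, sym0(K(inverse(g1), P)))

def mat(rows):
    return [[Fr(v) for v in row] for row in rows]

identity = mat([[1, 0], [0, 1]])
g = mat([[2, 1], [0, 3]])
gamma = mat([[1, 1], [2, 5]])
g1 = mat([[1, 2], [0, 1]])

P = K(g, identity)       # a generic point x = g x_0
x1 = K(g1, identity)     # the point x_1 = g1 x_0

print(K(gamma, sym0(P)) == sym0(K(S(gamma), P)))           # relation (4-3)
print(sym(g1, P) == sym0(K(mul(S(g1), inverse(g1)), P)))    # second form in (4-5)
print(sym(g1, x1) == x1)                                    # x_1 is a fixed point
print(sym(g1, sym(g1, P)) == P)                             # the symmetry is involutive
```

Each line prints `True`. In this model the symmetry about $x_1$ works out to $P \to x_1 P^{-1} x_1$, which makes the fixed point and the involutive property visible at a glance.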
(c) From (4-2) one deduces

\[ (S'_{x_0})_{x_0} \circ p' = p' \circ S \]

where $(S'_{x_0})_{x_0}$ is the automorphism of the tangent vector space $T_{x_0}$ at $x_0$ to $V_n$
defined by $S_{x_0}$. If $\lambda \in \underline{M}$

\[ (S'_{x_0})_{x_0}(p' \lambda) = p'(S \lambda) = -p' \lambda \]

Thus to every vector $p' \lambda$ of $T_{x_0}$, $(S'_{x_0})_{x_0}$ makes correspond the opposite vector;

\[ (S'_{x_0})_{x_0} = -\mathrm{Id}_{x_0} \tag{4-6} \]

One establishes on the other hand that the canonical connection of a symmetric space
is invariant under $S_{x_0}$, hence under every symmetry. One thus easily recovers
the construction of the symmetric point, with respect to $x_0$, of a point of a normal
neighborhood with center $x_0$ by means of the geodesic arcs of the canonical
connection; it follows, in particular, that $S_{x_0}$ admits $x_0$ as an isolated fixed point.

These results, and in particular (4-6), extend immediately to the
symmetry with respect to an arbitrary point.




5 ADJOINT OPERATORS ON A SYMMETRIC SPACE


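We consider throughout what follows a symmetric space $V_n = G/H$
admitting a volume element $\eta$ invariant under $G$.

(a) The invariant tensor $\eta$ has vanishing covariant derivative in the
canonical connection W. In a domain $U$ of local coordinates $(x^\alpha)$, one has:

\[ \eta = \rho_U\, dx^1 \wedge dx^2 \wedge \cdots \wedge dx^n = \rho_U\, dx \qquad (\rho_U > 0) \]

Denote by $\Gamma^\alpha_{\beta\gamma}$ the coefficients on $U$ of the canonical connection. One gets:

\[ \nabla_\beta\, \eta_{1 2 \cdots n} = \partial_\beta \rho_U - \Gamma^\sigma_{\sigma\beta}\, \rho_U = 0 \]

The canonical connection W being torsion-free, one obtains on $U$:

\[ \Gamma^\sigma_{\beta\sigma} = \Gamma^\sigma_{\sigma\beta} = \frac{\partial_\beta \rho_U}{\rho_U} \tag{5-1} \]

Denote by $\delta$ the codifferentiation operator on contravariant tensors
defined by:

\[ \delta : T^{\alpha_1 \cdots \alpha_p} \to -\nabla_\beta T^{\beta \alpha_2 \cdots \alpha_p} \tag{5-2} \]

If $V$ is a vector field with relatively compact support in $U$,

\[ \delta V = -\nabla_\beta V^\beta = -(\partial_\beta V^\beta + \Gamma^\beta_{\beta\sigma} V^\sigma) = -\frac{1}{\rho_U}\, \partial_\beta(V^\beta \rho_U) \]

and consequently:

\[ \int_U \delta V\, \eta = -\int_U \partial_\beta(V^\beta \rho_U)\, dx = 0 \]

One deduces by a standard procedure that, for every vector field $V$
with compact support in $V_n$,

\[ \int_{V_n} \delta V\, \eta = 0 \tag{5-3} \]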

(b) Let $T$ be a contravariant $p$-tensor and $W$ a covariant $(p-1)$-tensor such that the intersection of their supports is compact. If $V$ is the vector defined in local coordinates by

    $V^\alpha = \frac{1}{(p-1)!} T^{\alpha \alpha_2 \cdots \alpha_p} W_{\alpha_2 \cdots \alpha_p}$

integration by parts gives

    $\int_{V_n} (T, \nabla W) \, \eta = -\frac{1}{p} \int_{V_n} \delta V \, \eta + \frac{1}{p} \int_{V_n} (\delta T, W) \, \eta$

that is, by (5-3),

    $\int_{V_n} (T, \nabla W) \, \eta = \frac{1}{p} \int_{V_n} (\delta T, W) \, \eta$    (5-4)
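(c) Let $D$ be a differential operator of $\mathbf{D}(G/H)$. If $f_{(1)} \in C^\infty(V_n)$, one can write

    $D f_{(1)} = \sum_{q=0}^{N} (t_{(q)}, \nabla^q f_{(1)})$    (5-5)

where the symmetric contravariant tensors $t_{(q)}$ are invariant under $G$ (theorem of §1). For $f_{(2)} \in C_0^\infty(V_n)$ we have

    $\langle D f_{(1)}, f_{(2)} \rangle = \sum_{q=0}^{N} \int_{V_n} (t_{(q)} f_{(2)}, \nabla^q f_{(1)}) \, \eta$

Applying the relation (5-4) $q$ times to the term of index $q$, we get

    $\langle D f_{(1)}, f_{(2)} \rangle = \sum_{q=0}^{N} \int_{V_n} \frac{1}{q!} \left( f_{(1)}, \delta^q (t_{(q)} f_{(2)}) \right) \eta$

It follows that $D$ admits the adjoint operator $\bar{D}$ defined by

    $\bar{D} f_{(2)} = \sum_{q=0}^{N} \frac{1}{q!} \delta^q (t_{(q)} f_{(2)})$    (5-6)

The space being symmetric, the $t_{(q)}$ have zero covariant derivative in $\omega$ and, by the definition of $\delta$, one has simply

    $\bar{D} f_{(2)} = \sum_{q=0}^{N} (-1)^q (t_{(q)}, \nabla^q f_{(2)})$    (5-7)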

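6 TRANSFORM BY SYMMETRY OF AN OPERATOR OF $\mathbf{D}(G/H)$

(a) If $D \in \mathbf{D}(G/H)$, consider the differential operator

    $\tilde{D} = S^*_{x_0} D S^*_{x_0}$

This operator is invariant under $G$. Indeed, it follows from (4-3) that for $\gamma \in G$:

    $K_\gamma^* \tilde{D} = K_\gamma^* S^*_{x_0} D S^*_{x_0} = S^*_{x_0} K^*_{s(\gamma)} D S^*_{x_0} = S^*_{x_0} D K^*_{s(\gamma)} S^*_{x_0} = S^*_{x_0} D S^*_{x_0} K_\gamma^*$

That is,

    $K_\gamma^* \tilde{D} = \tilde{D} K_\gamma^*$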




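(b) Suppose $D$ is defined by (5-5) and let us evaluate $\tilde{D} f$, where $f \in C^\infty(V_n)$. We get

    $D S^*_{x_0} f = \sum_{q=0}^{N} (t_{(q)}, \nabla^q S^*_{x_0} f)$

and

    $S^*_{x_0} D S^*_{x_0} f = \sum_{q=0}^{N} (S^*_{x_0} t_{(q)}, S^*_{x_0} \nabla^q S^*_{x_0} f)$

The canonical connection being invariant under symmetry, $S^*_{x_0}$ and $\nabla^q$ commute and we have

    $\tilde{D} f = \sum_{q=0}^{N} (S^*_{x_0} t_{(q)}, \nabla^q f)$

If $S_x$ is the symmetry with respect to the point $x = \gamma x_0$, it follows from (4-5) that

    $S_{x_0} = S_x \circ K_{\gamma s(\gamma^{-1})}$

and

    $[S^*_{x_0} t_{(q)}](x) = [S^*_x K^*_{\gamma s(\gamma^{-1})} t_{(q)}](x) = [S^*_x t_{(q)}](x)$

At the point $x$, $(S'_x)_x = -\mathrm{Id}$ and we obtain

    $S^*_{x_0} t_{(q)} = (-1)^q t_{(q)}$

Thus

    $\tilde{D} f = \sum_{q=0}^{N} (-1)^q (t_{(q)}, \nabla^q f)$

and, by (5-7), the operators $\tilde{D}$ and $\bar{D}$ coincide. We have therefore established:

    $\bar{D} = S^*_{x_0} D S^*_{x_0}, \qquad D = S^*_{x_0} \bar{D} S^*_{x_0}$    (6-1)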
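As a rough illustration of (6-1), one may look at the simplest symmetric space: the real line, with translations as the group and the reflection $x \mapsto -x$ as the symmetry at $x_0 = 0$. These choices, and every function name below, are assumptions made for this sketch and are not taken from the text. An invariant operator is then $D = \sum_q a_q \, d^q/dx^q$ with constant $a_q$, and the code checks on a sample polynomial that conjugating $D$ by the reflection yields the operator with coefficients $(-1)^q a_q$, in the spirit of (5-7).

```python
def derivative(poly):
    """Derivative of a polynomial stored as coefficients [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(poly)][1:] or [0]


def apply_operator(coeffs, poly):
    """Apply D = sum_q coeffs[q] * d^q/dx^q (constant coefficients) to poly."""
    result = [0] * len(poly)
    current = list(poly)
    for a in coeffs:
        for k, c in enumerate(current):
            result[k] += a * c
        current = derivative(current)
    return result


def reflect(poly):
    """Pull back by the symmetry x -> -x, i.e. replace f(x) by f(-x)."""
    return [(-1) ** k * c for k, c in enumerate(poly)]


def adjoint(coeffs):
    """Formal adjoint in this toy setting: a_q becomes (-1)^q a_q."""
    return [(-1) ** q * a for q, a in enumerate(coeffs)]


# Hypothetical sample data: D = 1 + 2 d/dx + 3 d^2/dx^2 - 4 d^3/dx^3.
D = [1, 2, 3, -4]
f = [5, -1, 0, 2, 7]
conjugated = reflect(apply_operator(D, reflect(f)))   # S* D S* f
via_adjoint = apply_operator(adjoint(D), f)           # adjoint applied to f
```

In this flat model the commutativity of the algebra is of course evident directly; the sketch is only meant to make the roles of $S^*_{x_0}$ and of the sign $(-1)^q$ concrete.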


7 COMMUTATIVITY THEOREM


The important relation (6-1) entails the commutativity of the algebra $\mathbf{D}(G/H)$. Indeed, let $D^{(1)}$ and $D^{(2)}$ be two operators of $\mathbf{D}(G/H)$, and $\bar{D}^{(1)}$ and $\bar{D}^{(2)}$ their respective adjoints.

By (2-5):

    $\overline{D^{(1)} D^{(2)}} = \bar{D}^{(2)} \bar{D}^{(1)}$

On the other hand, by (6-1):

    $\overline{D^{(1)} D^{(2)}} = S^*_{x_0} D^{(1)} D^{(2)} S^*_{x_0} = S^*_{x_0} D^{(1)} S^*_{x_0} \cdot S^*_{x_0} D^{(2)} S^*_{x_0} = \bar{D}^{(1)} \bar{D}^{(2)}$

One deduces

    $\bar{D}^{(2)} \bar{D}^{(1)} = \bar{D}^{(1)} \bar{D}^{(2)}$

that is,

    $S^*_{x_0} D^{(2)} D^{(1)} S^*_{x_0} = S^*_{x_0} D^{(1)} D^{(2)} S^*_{x_0}$

and consequently $D^{(2)} D^{(1)} = D^{(1)} D^{(2)}$.


Theorem. If $G/H$ is a symmetric space admitting an invariant volume element, the algebra $\mathbf{D}(G/H)$ of invariant differential operators is commutative.

By passing to a covering, one deduces that commutativity still holds for $\mathbf{D}(G/H)$ if $G/H$ is a locally symmetric homogeneous space admitting an invariant positive measure.


BIBLIOGRAPHY


1. S. Helgason, Differential geometry and symmetric spaces. Academic Press, New York, 1962.

2. A. Lichnerowicz, Géométrie des groupes de transformations. Dunod, Paris, 1958. Opérateurs différentiels invariants sur un espace symétrique. Compt. Rend. Acad. Sci. 257, 1963, p. 3548.

3. A. Selberg, Harmonic analysis and discontinuous groups in symmetric Riemannian spaces. J. Indian Math. Soc. 20, 1956, pp. 47-87.


IV


Hyperbolic Partial Differential Equations on a Manifold


Y. CHOQUET-BRUHAT


1 Definitions
2 Tensor-Valued Distributions on a Riemannian Manifold $V_n$
3 Fundamental Solutions
4 Hyperbolic Operators: Local Definitions
5 Timelike Paths
6 Local Properties
7 Global Hyperbolicity
8 Consequences of Global Hyperbolicity
9 Boundaries
10 Spatial Boundaries
11 Continuity Property of $\mathscr{E}_P^+$
12 Fundamental Solutions of a Globally Hyperbolic Differential Operator on $V_n$: Consequences
13 Cauchy Problem
14 Local Existence Theorem for the Fundamental Solution
References


1 DEFINITIONS 


The term $V_n$ will denote a paracompact, oriented manifold, which will be supposed for simplicity to be of class $C^\infty$, although this assumption could be relaxed to some $C^r$. $C^\infty(V_n)$ and $C_0^\infty(V_n)$ will denote, respectively, the space of functions of class $C^\infty$ on $V_n$ and the space of those functions which have compact support.

A linear partial differential operator on $V_n$ is a linear map of $C^\infty(V_n)$ into $C^\infty(V_n)$,

    $u \mapsto Lu$

such that in every coordinate system $(v, \phi)$, $v$ being the domain of the coordinate patch and $\phi$ a homeomorphism of $v$ onto the open set $\Omega \subset R^n$, $Lu \circ \phi^{-1}$ can be expressed by

    $Lu \circ \phi^{-1} = \sum_{p \le m} a^{i_1 \cdots i_p}(x^j) \, \frac{\partial^p}{\partial x^{i_1} \cdots \partial x^{i_p}} (u \circ \phi^{-1})$    (1.1)

where the $a^{i_1 \cdots i_p}(x^j)$ are $C^\infty$ functions on $\Omega$.

A distribution $u$ on $V_n$, $u \in \mathscr{D}'(V_n)$, is a continuous linear form on the space $C_0^\infty(V_n)$:

    $\langle u, \varphi \rangle \in R \qquad \forall \varphi \in C_0^\infty(V_n)$    (1.2)

"continuous" meaning that for every compact $K \subset V_n$ and for a finite covering $(v_i, \phi_i)$ of $K$ there exist positive constants $C(K)$ and $k(K)$ such that, for every $\varphi \in C_0^\infty(V_n)$ with support in $K$,

    $|\langle u, \varphi \rangle| \le C(K) \sum_{q \le k(K)} \sup |D^q (\varphi \circ \phi_i^{-1})|$    (1.3)

where $D^q$ denotes the set of all partial derivatives of order $q$. If there exists $p$ such that $k(K) \le p$ for all compact $K$, the distribution $u$ is said to be of order $p$. One identifies a locally summable function on $V_n$, $f \in L^1_{loc}(V_n)$, with a distribution $f$ of order 0 by setting

    $\langle f, \varphi \rangle = \int_{V_n} f \varphi \, \eta$    (1.4)

where $\int_{V_n}$ denotes the usual integral on $V_n$, $\eta$ being an everywhere nonvanishing $C^\infty$ $n$-form on $V_n$, for instance a Riemannian volume element. The set of distributions on $V_n$ does not depend on the choice of $\eta$, though the identification of a particular function $f \in L^1_{loc}(V_n)$ with a distribution depends on this choice.

A differential operator $L$ on distributions on $V_n$ is defined by

    $\langle Lu, \varphi \rangle = \langle u, L^* \varphi \rangle \qquad u \in \mathscr{D}'(V_n), \quad \varphi \in C_0^\infty(V_n)$    (1.5)

where $L^*$ is the adjoint operator of $L$, defined on $C_0^\infty(V_n)$ by a formula analogous to (1.5), where $u$ and $\varphi$ both belong to $C_0^\infty(V_n)$.


2 TENSOR-VALUED DISTRIBUTIONS ON A RIEMANNIAN MANIFOLD $V_n$ (LICHNEROWICZ [2])


A $p$-covariant tensor distribution $t$ on $V_n$ is a continuous (in a sense analogous to (1.3)) linear form on the space of $C^\infty$ $p$-contravariant tensors $\varphi$ on $V_n$ with compact support.¹ If $t$ is an ordinary tensor, locally integrable, it is identified with a distribution by

    $\langle t, \varphi \rangle = \int_{V_n} (t, \varphi)_x \, \eta(x)$

where $(t, \varphi)_x$ denotes the classical scalar product at the point $x \in V_n$ and $\eta(x)$ the Riemannian volume element at that point.

The covariant derivative of a tensor distribution with respect to the linear connection given by the metric (supposed to be $C^\infty$) is defined by $\langle \nabla t, \varphi \rangle = \langle t, \delta\varphi \rangle$ where, written in local coordinates for simplicity,

    $(\delta\varphi)^{\alpha_1 \cdots \alpha_p} = -\nabla_\alpha \varphi^{\alpha \alpha_1 \cdots \alpha_p}$

$\nabla_\alpha$ being the classical covariant derivative.

These definitions make it easy to write differential operators on $V_n$ in the sense of §1, that is, on scalars, and also on tensors.

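Example


Laplacians on $V_n$. The Laplace operator on scalars is

    $\Delta = -\nabla^\alpha \nabla_\alpha$

It is defined on exterior forms (that is, on skew-symmetric tensors) by (de Rham)

    $\Delta = d\delta + \delta d$

and has been extended to all tensors by Lichnerowicz [2]. $\Delta$ is a map of the space of $C^\infty$ $p$-tensors of a given type into itself, expressed in local coordinates by

    $(\Delta T)_{\alpha_1 \cdots \alpha_p} = -\nabla^\rho \nabla_\rho T_{\alpha_1 \cdots \alpha_p} + \sum_k R_{\alpha_k \rho} \, T_{\alpha_1 \cdots}{}^{\rho}{}_{\cdots \alpha_p} - \sum_{k \ne l} R_{\alpha_k \rho, \alpha_l \sigma} \, T_{\alpha_1 \cdots}{}^{\rho}{}_{\cdots}{}^{\sigma}{}_{\cdots \alpha_p}$

where $R_{\alpha\beta, \lambda\mu}$ and $R_{\alpha\beta}$ are the curvature and Ricci tensors of $V_n$. The higher-order terms, in local coordinates, are the same for all equations; they are, if $g_{\alpha\beta}$ is the Riemannian metric,

    $-g^{\alpha\beta} \frac{\partial}{\partial x^\alpha} \frac{\partial}{\partial x^\beta}$

The operators thus defined are invariant under isometries and are self-adjoint:

    $\langle \Delta u, \varphi \rangle = \langle u, \Delta\varphi \rangle \qquad \forall u, \varphi \in C_0^\infty(V_n)$

with an analogous identity for tensors (cf. [2]).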


1 These tensor-distributions can be identified with de Rham’s currents for completely 
skew symmetric tensors. 


3 FUNDAMENTAL SOLUTIONS²


Definition. A fundamental solution for a partial differential operator $L$ on $V_n$ is a distribution $E(x, x')$ on $V_n \times V_n$ which satisfies

    $L_{x'} E(x, x') = \delta(x, x')$    (3.1)

where $\delta(x, x') = \delta(x - x')$ is the Dirac kernel on $V_n \times V_n$, defined by ($\varphi, \psi \in C_0^\infty(V_n)$):

    $\langle \delta(x, x'), \varphi(x)\psi(x') \rangle = \int_{V_n} \varphi(x)\psi(x) \, \eta(x)$    (3.2)

A fundamental solution is said to be regular if it is a regular kernel in the sense of Schwartz, that is, if the distributions $\theta(x)$ and $\theta'(x')$ on $V_n$ defined, respectively, by

    $\langle \theta(x), \varphi(x) \rangle = \langle E(x, x'), \varphi(x)\psi(x') \rangle$    (3.3)
    $\langle \theta'(x'), \psi(x') \rangle = \langle E(x, x'), \varphi(x)\psi(x') \rangle$    (3.4)

can be identified with locally summable functions on $V_n$, which will be denoted by

    $\theta(x) = \langle E(x, x'), \psi(x') \rangle$    (3.5)

and

    $\theta'(x') = \langle E(x, x'), \varphi(x) \rangle$    (3.6)

$E(x, x')$ will be called regular of order $p$ if $\theta(x)$ and $\theta'(x')$ are both of class $C^p$.

Let us denote by $\mathscr{D}_0'^{\,p}$ the space of distributions on $V_n$ of order $p$ and compact support. From the definition, the following theorem [4] will easily follow:


Theorem. (1) A fundamental solution, regular of order $p$, gives a solution of the equation

    $Lu = v \qquad v \in \mathscr{D}_0'^{\,p}$    (3.7)

by the following formula (convolution in the sense of Volterra)

    $u(x') = \int_{V_n} E(x, x') v(x)$    (3.8)

(2) This solution is unique in $\mathscr{D}_0'^{\,p}$ if $L^*$ also has a fundamental solution regular of order $p$.


² For simplicity, we will give the definitions for one operator on scalars, though they can easily be extended to operators on tensors, or to systems of such operators: the fundamental solutions are, in that case, tensor-valued distributions on $V_n \times V_n$ (cf. Lichnerowicz [2]) or matrices of such distributions (cf. [5]).




PROOF. Equation (3.8) means that

    $\langle u(x'), \varphi(x') \rangle = \langle v(x), \theta(x) \rangle \qquad \forall \varphi \in C_0^\infty(V_n)$    (3.9)

where

    $\theta(x) = \langle E(x, x'), \varphi(x') \rangle \qquad \theta \in C^p(V_n)$    (3.10)

By the properties of supports and differentiability, $u(x')$ is well defined by (3.9). By the definitions of §1, one has

    $\langle L_{x'} u(x'), \varphi(x') \rangle = \langle u(x'), L^*_{x'} \varphi(x') \rangle$    (3.11)

and by (3.9),

    $\langle u(x'), L^*_{x'} \varphi(x') \rangle = \langle v(x), \theta^*(x) \rangle$

with

    $\theta^*(x) = \langle E(x, x'), L^*_{x'} \varphi(x') \rangle = \langle L_{x'} E(x, x'), \varphi(x') \rangle$

that is, by (3.1),

    $\theta^*(x) = \varphi(x)$

Hence $u(x')$ verifies, in the distribution sense,

    $Lu = v$

(2) Let us denote by $E^*(x, x')$ the fundamental solution of $L^*$. A similar computation shows easily that, for every $u \in \mathscr{D}_0'^{\,p}(V_n)$:

    $\int_{V_n} E^*(x, x') L_{x'} u(x') = u(x)$    (3.12)

from which uniqueness follows.

These two results will be extended to $u, v \in \mathscr{D}'^+(V_n)$ when $L$ is hyperbolic.
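A finite-dimensional analogue may help to fix ideas about the theorem of this section. In the sketch below, the grid, the backward-difference operator standing in for $L$, and all function names are assumptions chosen for illustration. The kernel `E[x][xp]` is a discrete Heaviside function, so that applying the operator in the second variable gives the discrete Dirac kernel as in (3.1), and the sum over `x` plays the role of the Volterra convolution (3.8).

```python
def apply_L(u):
    """Backward difference (L u)[i] = u[i] - u[i-1], with u[-1] taken as 0."""
    return [u[i] - (u[i - 1] if i > 0 else 0) for i in range(len(u))]


def fundamental_solution(n):
    """Kernel E[x][xp] = 1 if xp >= x else 0 (support 'in the future' of x)."""
    return [[1 if xp >= x else 0 for xp in range(n)] for x in range(n)]


def volterra(E, v):
    """Discrete analogue of (3.8): u(xp) = sum over x of E(x, xp) * v(x)."""
    n = len(v)
    return [sum(E[x][xp] * v[x] for x in range(n)) for xp in range(n)]


# Hypothetical source term with compact support on a grid of six points.
v = [0, 3, -1, 0, 2, 0]
E = fundamental_solution(len(v))
u = volterra(E, v)          # candidate solution of L u = v
residual = [a - b for a, b in zip(apply_L(u), v)]
```

Here `residual` vanishes identically, which is the discrete counterpart of the statement that $u$ verifies $Lu = v$.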


4 HYPERBOLIC OPERATORS: LOCAL DEFINITIONS


Definition 1. The characteristic polynomial of the operator $L$, given by (1.1) in local coordinates, at a point $x \in V_n$, is

    $h(\xi) = \sum a^{i_1 \cdots i_m} \xi_{i_1} \cdots \xi_{i_m}$    (4.1)

where $\xi_i$ is a covariant vector at $x$. The cone in the cotangent plane $T_x^*$ at $x$ defined by

    $h(\xi) = 0$

is called the characteristic cone. It does not depend on the choice of coordinates, because the higher-order terms of $L$ transform into higher-order terms under a change of coordinates.


Definition 2. The operator $L$ is called hyperbolic at the point $x$ if there is a domain $\Gamma_x$ (which is then a convex open cone) in $T_x^*$ such that every line through $\lambda \in \Gamma_x$ (except the line $\lambda x$ through the vertex) cuts the characteristic cone in $m$ real distinct points.


EXAMPLE. The second-order differential operators with higher-order terms

    $g^{\alpha\beta} \frac{\partial}{\partial x^\alpha} \frac{\partial}{\partial x^\beta}$

are hyperbolic at $x$ if and only if the cone

    $g^{\alpha\beta} \xi_\alpha \xi_\beta = 0$

is convex, that is, if the quadratic form is of signature $(1, n - 1)$.
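In the second-order case the condition of Definition 2 can be probed numerically on sample lines. The sketch below is only an illustration under stated assumptions: the matrices, the sample vectors and the function names are hypothetical, and the check covers individual lines, not all of them. Restricting $h(\xi) = g^{\alpha\beta}\xi_\alpha\xi_\beta$ to the line $\xi = \lambda + t\mu$ gives a quadratic in $t$, whose discriminant decides whether the line meets the characteristic cone in two real distinct points.

```python
def bilinear(g, xi, eta):
    """g^{ab} xi_a eta_b for a symmetric matrix g given as nested lists."""
    n = len(g)
    return sum(g[a][b] * xi[a] * eta[b] for a in range(n) for b in range(n))


def real_distinct_intersections(g, lam, mu):
    """Count real distinct t with h(lam + t*mu) = 0, h(xi) = g^{ab} xi_a xi_b.

    Assumes h(mu) != 0, so the restriction of h to the line is a true quadratic.
    """
    a = bilinear(g, mu, mu)
    b = 2 * bilinear(g, lam, mu)
    c = bilinear(g, lam, lam)
    if a == 0:
        raise ValueError("direction mu is characteristic; the quadratic degenerates")
    disc = b * b - 4 * a * c
    return 2 if disc > 0 else (1 if disc == 0 else 0)


# Hypothetical examples: signature (1, 2) versus a positive definite form.
minkowski = [[1, 0, 0], [0, -1, 0], [0, 0, -1]]
euclidean = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
lam = (1, 0, 0)                      # lies inside the cone of the first form
count_hyperbolic = real_distinct_intersections(minkowski, lam, (0, 1, 0))
count_elliptic = real_distinct_intersections(euclidean, lam, (0, 1, 0))
```

For the form of signature $(1, 2)$ the sampled lines through `lam` meet the cone twice, whereas for the definite form the cone reduces to the origin and no real intersection is found.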


5 TIMELIKE PATHS 


Let us denote by $C_x$ the closure of the dual of $\Gamma_x$; $C_x$ is a convex closed cone in the tangent space at $x$.

If $L$ is hyperbolic at every point of $V_n$ and has differentiable coefficients, there is defined on $V_n$ a differentiable field of cones $C_x$ in the tangent spaces. We will always suppose in the following that one can define globally on $V_n$ (continuously) the two fields of half-cones $C_x^+$ and $C_x^-$, which will be called, respectively, future and past cones (the manifold $V_n$ will be said to be time-orientable with respect to the field $C_x$).


Definition 1. A differentiable, oriented path on $V_n$, $t \mapsto \gamma(t)$, $a \le t \le b$, is called timelike if at each point its positive tangent is in $C_x^+$.

The set $\tau$ of timelike paths on $V_n$ (called t-paths) is the closure, in the compact-open topology of the set $\mathscr{C}$ of paths on $V_n$ (Bott, this volume, Ch. XVIII), of the set of differentiable t-paths defined above.

One can show that t-paths are rectifiable and have a tangent almost everywhere [3].

We will denote by $\mathscr{C}_x^+$ (resp. $\mathscr{C}_x^-$) the set of t-paths originating at $x$ (resp. ending at $x$). $\mathscr{C}_x^+$ and $\mathscr{C}_x^-$ are closed sets in $\mathscr{C}$. We will denote by $\mathscr{E}_x^+$ (resp. $\mathscr{E}_x^-$), and call the future of $x$ (resp. past of $x$), the image in $V_n$ of $\mathscr{C}_x^+$ (resp. $\mathscr{C}_x^-$). The sets $\mathscr{E}_x^+$ and $\mathscr{E}_x^-$ are not necessarily closed sets in $V_n$.³

The space $V_n$, being paracompact, is metrizable: its topology can be induced by a positive Riemannian metric which can be chosen complete [6], p. 292. We will parametrize all rectifiable paths in $V_n$ proportionally to their arc length in such a metric [7]; they will then be defined by maps from the unit interval $[0, 1]$ to $V_n$. The following definition turns the set of these paths into a metric space, whose topology is induced by the compact-open topology.


³ For instance, remove a point from Minkowski space (example given in [9]).


Definition 2. The distance between two paths $\gamma_1$, $\gamma_2$ is

    $d(\gamma_1, \gamma_2) = \sup_{t \in [0, 1]} \|\gamma_1(t) - \gamma_2(t)\|$

where $\|x - y\|$ denotes the Riemannian distance between the points $x$ and $y$ in $V_n$.
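The distance of Definition 2 can be approximated for explicitly given paths by sampling the parameter. The sketch below is a rough illustration, assuming paths in a Euclidean space (so that the Riemannian distance is the ordinary Euclidean one) and a uniform grid on $[0, 1]$; the function name and the sample paths are hypothetical.

```python
import math


def path_distance(gamma1, gamma2, samples=1001):
    """Approximate sup over t in [0, 1] of ||gamma1(t) - gamma2(t)||.

    The supremum is replaced by a maximum over a uniform grid, and the
    Riemannian distance by the Euclidean one (an assumption of this sketch).
    """
    best = 0.0
    for i in range(samples):
        t = i / (samples - 1)
        best = max(best, math.dist(gamma1(t), gamma2(t)))
    return best


# Two hypothetical paths in the plane, both parametrized on [0, 1].
straight = lambda t: (t, 0.0)
tilted = lambda t: (t, 0.5 * t)
gap = path_distance(straight, tilted)
```

Since the two sample paths separate linearly, the largest gap is attained at the endpoint $t = 1$.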


6 LOCAL PROPERTIES 


Definition 1. The Jacobi equation of the operator $L$ is the first-order partial differential equation on $V_n$:

    $h(\mathrm{grad}\, v) = 0 \qquad v \in C^\infty(V_n)$

where $\mathrm{grad}\, v$ is the usual covariant vector defined by the differential of $v$ and $h$ the characteristic polynomial.


Definition 2. The Hamilton-Jacobi equations are a system of ordinary differential equations on $T^*(V_n)$ which are given, in a coordinate patch, by

    $\frac{dx^\alpha}{\partial h / \partial \xi_\alpha} = \frac{-d\xi_\alpha}{\partial h / \partial x^\alpha} = dt \qquad h(x, \xi) = 0$

The integral curves of these equations are the characteristics of the Jacobi equation; they are called the bicharacteristics of $L$.


Definition 3. It is classical to associate the Lagrange equations of a variational problem with the Hamilton equations: one shows that the bicharacteristics are solutions of

    $\frac{d}{dt} \left( \frac{\partial g(x, v)}{\partial v^\alpha} \right) - \frac{\partial g}{\partial x^\alpha}(x, v) = 0 \qquad v^\alpha = \frac{dx^\alpha}{dt} \qquad g(x, v) = 0$

where $g(x, v)$ is obtained⁴ by eliminating $\xi$ between $h(x, \xi) = 0$ and

    $v^\alpha = \frac{\partial h(x, \xi)}{\partial \xi_\alpha}$

so the bicharacteristics are the extremals of

    $\int g(x, dx)$

verifying $g(x, dx) = 0$.

In the second-order case, we have

    $h(x, \xi) = g^{\alpha\beta}(x) \xi_\alpha \xi_\beta$
    $g(x, v) = g_{\alpha\beta}(x) v^\alpha v^\beta$

The bicharacteristics are the null geodesics corresponding to the Lorentzian metric.


⁴ Cf. J. Leray, par. 5.


Local Properties 


1. For every point $y \in V_n$ there exists an open neighborhood $\mathscr{B}_y$ (that is, of the path $[0, 1] \to y$), called a canonical ball in $\mathscr{C}$, such that the image of $\mathscr{B}_y \cap \mathscr{C}_y^+$ (resp. $\mathscr{B}_y \cap \mathscr{C}_y^-$) is $B_y \cap C_y^+$ (resp. $B_y \cap C_y^-$), where $B_y$ is an open neighborhood of $y$ in $V_n$ and $C_y^+$ (resp. $C_y^-$) a closed, convex half-cone, the boundary of which is generated by bicharacteristic t-paths. Moreover, a point $z$ in $B_y$ is interior to $C_y^+$ if and only if there is a t-path $yz$ which is not a bicharacteristic. This local property has been known for a long time in the second-order case [it is sufficient to take normal coordinates at $y$ to see it (Fig. 1)] and results from Leray [1] in the higher-order case.

FIGURE 1

2. It results from property 1 that, for every $x \in V_n$, there exist a neighborhood $V(x)$ of $x$ and points $y$ and $z$ such that

    $V(x) \subset \mathscr{E}_z^+ \qquad V(x) \subset \mathscr{E}_y^-$

PROOF. Take $zx$ (resp. $xy$) timelike and not bicharacteristic, in a sufficiently small neighborhood of $x$; $x$ will be interior to $B_z \cap C_z^+$, hence there will exist $V(x) \subset B_z \cap C_z^+ \subset \mathscr{E}_z^+$.

3. There is a complete positive Riemannian metric on $V_n$, and canonical balls $\mathscr{B}_y$, such that the elliptic length of all t-paths in $\mathscr{B}_y$ is uniformly bounded, for each $y \in V_n$.


PROOF (in the second-order case). Denote by $g_{\alpha\beta}$ the hyperbolic metric tensor and by $v$ a strictly timelike vector field on $V_n$ (such a field always exists on a hyperbolic manifold), such that

    $g_{\alpha\beta} v^\alpha v^\beta > k > 0$

The metric on $V_n$

    $d\bar{s}^2 = -g_{\alpha\beta} \, dx^\alpha dx^\beta + \frac{1}{k} (v_\alpha \, dx^\alpha)^2$    (6.1)

is an elliptic metric. The t-paths on $V_n$, $u \mapsto x^\alpha(u)$ in local coordinates, $u_0 \le u \le u_1$, satisfy

    $g_{\alpha\beta} \frac{dx^\alpha}{du} \frac{dx^\beta}{du} \ge 0$

Their elliptic length $\bar{s}$ therefore satisfies

    $\bar{s} \le \frac{1}{\sqrt{k}} \int_{u_0}^{u_1} \left| v_\alpha \frac{dx^\alpha}{du} \right| du$    (6.2)

In the neighborhood of a point we can always choose coordinates such that the vector $v_\alpha$ has the components $v_0 = 1$, $v_i = 0$ and, on a timelike arc [6], $dx^0/du \ge 0$. We deduce then from (6.2)

    $\bar{s} \le \frac{1}{\sqrt{k}} \left( x^0(u_1) - x^0(u_0) \right)$

which gives a uniform bound for the lengths of t-paths in the metric (6.1).

It is known⁵ that for every elliptic metric $g$ on $V_n$ there exists a conformal elliptic metric $\phi g$ which is complete; $\phi$ being continuous and bounded in a neighborhood, the lengths of t-paths in the metric $\phi g$ will also be uniformly bounded.


⁵ If $v^\alpha$, $u^\alpha$ are timelike, $g_{\alpha\beta} v^\alpha u^\beta \ge 0$; if $v^\alpha$ is strictly timelike, $g_{\alpha\beta} v^\alpha u^\beta > 0$.


7 GLOBAL HYPERBOLICITY 


The principal ideas in the next paragraphs are to be found in Leray [1]. 
The presentation is somewhat different, and I have made explicit some proofs 
which were implicit. 

It is said that $L$ is globally hyperbolic on $V_n$ if (1) it is hyperbolic at every point, and (2) the set $\mathscr{C}_x^+ \cap \mathscr{C}_y^-$ of timelike paths between two points is compact, in the topology defined above, for all $x$ and $y$.


Theorem 2. A necessary and sufficient condition for global hyperbolicity is that, in a complete, positive Riemannian metric on $V_n$,⁶ all timelike paths joining $x$ and $y$ have a bounded length.


PROOF. a. To prove the sufficiency, we will use the Ascoli theorem, which we recall here:

Let $\mathscr{C}$ be the space of all continuous maps from $X$ to $Y$ (metric spaces). A subset $F$ of $\mathscr{C}$ is compact if

1. $F$ is closed in $\mathscr{C}$.

2. The closure of $F[t]$ is compact for every $t \in X$.

3. $F$ is equicontinuous.

Here $X$ is the closed interval $I = [0, 1]$ and $Y$ is the manifold $V_n$ with its positive Riemannian metric. The family $\mathscr{C}$ is the family of all continuous arcs, mappings $\gamma: I \to V_n$; $F$ is the family of those arcs, parametrized proportionally to arc length in $V_n$, which are timelike and join two given points $x$ and $y$. We suppose that these arcs have a length bounded by a positive number $K$. Let us show that they verify the conditions of the Ascoli theorem.

1. $F$ is closed in $\mathscr{C}$, since $F = \mathscr{C}_x^+ \cap \mathscr{C}_y^-$, $\mathscr{C}_x^+$ and $\mathscr{C}_y^-$ being closed sets.

2. $V_n$ being a complete Riemannian manifold, the closure of $F[t]$ will be compact if it is bounded, but it is so since for every arc $\gamma \in F$ and $t \in X$

    $\|\gamma(t) - x\| \le K$

3. The family $\gamma \in F$ is equicontinuous, since by the definition of the parametrization

    $\|\gamma(t_1) - \gamma(t_2)\| \le k(\gamma) \, |t_2 - t_1|$

where $k(\gamma)$ denotes the length of the arc $\gamma$, which verifies

    $k(\gamma) \le K$


⁶ Such a metric always exists, cf. [7].




b. The necessity of the condition is a consequence of the following: all t-paths in a compact subset $\Gamma$ of $\mathscr{C}$ have bounded length.

Such a subset $\Gamma$ can be covered by a finite number of canonical balls $\mathscr{B}_y$, but in a canonical ball the proof results from the local property 3, §6.

Theorem 2 enables one to construct many examples of globally hyper- 
bolic manifolds for the Laplacian operator. 


Examples

1. $V_n = R^n$, with hyperbolic metric

    $ds^2 = g_{\alpha\beta} \, dx^\alpha dx^\beta$

such that all cones $C_x$ ($g_{\alpha\beta} X^\alpha X^\beta \ge 0$), translated to the origin, are inside a given convex cone.

Put on $V_n$ the Euclidean metric: it is complete, and the lengths of timelike paths between two points are bounded (this is easily shown by using the fact that there are globally defined spatial hypersurfaces).

2. $V_n = V_{n-1} \times R$, where $V_{n-1}$ is a complete Riemannian manifold with (positive definite) metric $d\sigma^2$. Put on $V_n$ the metric

    $ds^2 = (dx^0)^2 - d\sigma^2$

The associated positive metric

    $d\bar{s}^2 = (dx^0)^2 + d\sigma^2$

is complete, and the timelike paths of $V_n$ satisfy

    $\left( \frac{d\sigma}{du} \right)^2 \le \left( \frac{dx^0}{du} \right)^2 \qquad \frac{dx^0}{du} \ge 0$

which gives

    $\left( \frac{d\bar{s}}{du} \right)^2 \le 2 \left( \frac{dx^0}{du} \right)^2$

and, for the length between the points $x$ and $y$, of time coordinates $x^0$ and $y^0$, on a timelike arc, length($xy$) $\le 2|y^0 - x^0|$.
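The bound of Example 2 can be checked numerically in the simplest case. The sketch below assumes $V_{n-1} = R$ with $d\sigma^2 = ds^2$ for a single space coordinate $s$, and builds a random polygonal path whose segments all lie inside the future cone; the function names, the step sizes and the random seed are hypothetical choices made for this illustration only.

```python
import math
import random


def random_timelike_path(steps, rng):
    """Polyline in R x R (time x0, space s) with |ds| <= dx0 on every segment."""
    x0, s = 0.0, 0.0
    points = [(x0, s)]
    for _ in range(steps):
        dx0 = rng.uniform(0.01, 1.0)      # time always increases
        ds = rng.uniform(-dx0, dx0)       # stays inside (dx0)^2 - ds^2 >= 0
        x0, s = x0 + dx0, s + ds
        points.append((x0, s))
    return points


def positive_length(points):
    """Length in the associated positive metric (dx0)^2 + d sigma^2."""
    return sum(math.hypot(q[0] - p[0], q[1] - p[1])
               for p, q in zip(points, points[1:]))


rng = random.Random(0)
path = random_timelike_path(200, rng)
length = positive_length(path)
time_gap = abs(path[-1][0] - path[0][0])
```

Each segment has positive length at most $\sqrt{2}$ times its time increment, so `length` stays below `2 * time_gap` whatever the random draws are.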


3. Let $V_n$ be a $C^\infty$ manifold with a hyperbolic metric, time-oriented. Let $v$ be a strictly timelike vector field:

    $g_{\alpha\beta} v^\alpha v^\beta > k > 0$ on $V_n$

$V_n$ will be globally hyperbolic if the elliptic metric (6.1) is complete and if the integral

    $\int_x^y v_\alpha \, dx^\alpha$

taken on any timelike path joining $x$ and $y$ is uniformly bounded, in particular [8] if there exists a continuous function $f$ on $V_n$ such that $v_\alpha = \partial_\alpha f$ (the hypersurfaces $f$ = constant are then "global Cauchy surfaces").


Remark. A globally hyperbolic manifold cannot have closed time lines. We will show in §11 that, moreover, it satisfies the "strong causality" assumption of Hawking-Penrose.


8 CONSEQUENCES OF GLOBAL HYPERBOLICITY 


Lemma. For any two points $x$, $y$ in a globally hyperbolic $V_n$, there are neighborhoods $V(x)$, $W(y)$ such that the t-paths from $V(x)$ to $W(y)$, $\mathscr{C}^+_{V(x)} \cap \mathscr{C}^-_{W(y)}$, have a uniformly bounded elliptic length.


PROOF. If we choose $y^*$ interior to $\mathscr{E}_y^+$, in a small neighborhood of $y$, such that $y$ is interior to $\mathscr{E}^-_{y^*}$ (which is always possible by local property 1, §6), we can find a neighborhood $W(y)$ interior to $\mathscr{E}^-_{y^*}$ (Fig. 2). We choose, by an analogous method, $V(x)$ interior to $\mathscr{E}^+_{x^*}$. If $z \in V(x)$ and $u \in W(y)$ are joined by a t-path $zu$, this path is, by construction, included in a t-path $x^*zuy^*$, which proves that

    length($zu$) $\le$ length($x^*zuy^*$) $\le K$


FIGURE 2


We deduce from the lemma (Borel-Lebesgue) that the length of all t-paths joining two compact sets is bounded, which proves, again by the Ascoli theorem, the following:


Theorem. If $P$ and $Q$ are compact subsets of $V_n$, $\mathscr{C}_P^+ \cap \mathscr{C}_Q^-$ is compact.


Theorem. If $V_n$ is globally hyperbolic, the image in $V_n$ of $\mathscr{C}_P^+ \cap \mathscr{C}_Q^-$, which we denote by $\mathscr{E}_P^+ \cap \mathscr{E}_Q^-$, is compact.


PROOF. We will show that $\mathscr{E}_P^+ \cap \mathscr{E}_Q^-$ is closed and bounded in the metric of $V_n$. Since $V_n$ is complete, such a set is compact.

1. $\mathscr{C}_P^+ \cap \mathscr{C}_Q^-$ being compact, all t-paths from $P$ to $Q$ are bounded in length by some fixed number $N$. The image $\mathscr{E}_P^+ \cap \mathscr{E}_Q^-$ in $V_n$ is therefore contained in a bounded set.

2. Let $z_1, \ldots, z_n, \ldots$ be a sequence of points of $\mathscr{E}_P^+ \cap \mathscr{E}_Q^-$ converging to $z$ in $V_n$, that is,

    $\|z - z_n\| < \epsilon$ if $n > N$    (8.1)

Denote by $\gamma_i = x_i z_i y_i$ a sequence of t-paths in $\mathscr{C}_P^+ \cap \mathscr{C}_Q^-$; by the compactness of this set there is a subsequence, denoted $\gamma_n$, converging to a t-path $\gamma \in \mathscr{C}_P^+ \cap \mathscr{C}_Q^-$, that is,

    $\sup_{0 \le t \le 1} \|\gamma_n(t) - \gamma(t)\| < \epsilon \qquad n > N$

But $z$ belongs to the t-path $\gamma$, since if $z$ did not belong to $\gamma$ (a compact subset of $V_n$), there would be a number $\eta > 0$ such that

    $\inf_{0 \le t \le 1} \|z - \gamma(t)\| > \eta$    (8.2)

but for each $z_n$ there exists $t_n$ such that

    $z_n = \gamma_n(t_n)$    (8.3)

From (8.1) and (8.3) results

    $\|z - \gamma(t_n)\| < 2\epsilon$

which contradicts (8.2).


Corollary. $\mathscr{E}_P^+$ (resp. $\mathscr{E}_Q^-$) is a closed subset of $V_n$ if $P$ (resp. $Q$) is compact.


PROOF. Apply the preceding theorem to $\mathscr{E}_P^+ \cap \mathscr{E}_B^-$, where $B$ is a compact set containing the limit point of a sequence $z_1, \ldots, z_n, \ldots$ in $\mathscr{E}_P^+$.


Past and Future Compact Sets 


Definition. A set $P \subset V_n$ is called past-compact if the set of paths $\mathscr{C}_P^+ \cap \mathscr{C}_x^-$ is compact for all $x \in V_n$. Future-compact sets are defined analogously.


Theorem. A past (resp. future) compact set is closed. Every closed subset 
of a past compact set is past compact. 


PROOF. Analogous to the preceding ones. 


Theorem. The future $\mathscr{E}_P^+$ of a compact set $P$ is past-compact.


PROOF. We have already proved that $\mathscr{C}_P^+ \cap \mathscr{C}_x^-$ is compact if $P$ is compact but, by the definition itself,


te + 
6 p = Ce ,+ 


Theorem. If $Q$ is compact and $P$ past-compact, $\mathscr{C}_P^+ \cap \mathscr{C}_Q^-$ is compact in $\mathscr{C}$ and $\mathscr{E}_P^+ \cap \mathscr{E}_Q^-$ is compact in $V_n$.


PROOF. Analogous to the one used in Theorem 2, §7: take a small neighborhood $V(x)$ in $Q$, such that there exists $x^*$ with $V(x) \subset \mathscr{E}^-_{x^*}$; all t-paths from $P$ to $V(x)$ have an elliptic length bounded by the maximum of the elliptic lengths of t-paths from $P$ to $x^*$; then use the same arguments (Fig. 3).


FIGURE 3 


9 BOUNDARIES 


The boundary $\partial U$ of a set $U$ in $V_n$ will be, as usual,

    $\partial U = \bar{U} \cap \overline{CU}$

where $\bar{U}$ denotes the closure of $U$ in $V_n$ and $CU$ the complementary set. The interior of $U$ is

    $\mathring{U} = U - \partial U$


The following important lemma, theorem, and corollary do not depend 
on the global hyperbolicity hypothesis. 


Lemma. If $xy$ is a t-path which is not a bicharacteristic, there exists a neighborhood $V(y)$ such that $V(y) \subset \mathscr{E}_x^+$.


PROOF. Let us take $z$ on $xy$ such that $zy$ is in the canonical ball $\mathscr{B}_y$ of §6. Since $zy$ is not a bicharacteristic, there exists a neighborhood $V(y)$ such that $V(y) \subset \mathscr{E}_z^+$, but $\mathscr{E}_z^+ \subset \mathscr{E}_x^+$, since $z \in \mathscr{E}_x^+$, which proves that $V(y) \subset \mathscr{E}_x^+$.


Theorem. Let $P$ be any subset of $V_n$ and $xy$ a t-path such that

    $x \in \mathscr{E}_P^+, \qquad y \in \partial\mathscr{E}_P^+$

then $xy$ is a bicharacteristic and $xy \subset \partial\mathscr{E}_P^+$.


PROOF. (1) Since $y \in \partial\mathscr{E}_P^+$, there cannot be an open set $V(y)$ containing $y$ such that $V(y) \subset \mathscr{E}_P^+$; hence there cannot be such a set with $V(y) \subset \mathscr{E}_z^+$ with $z \in xy$, since $\mathscr{E}_z^+ \subset \mathscr{E}_P^+$. We deduce therefore from the lemma that $zy$ is a bicharacteristic if $zy \in \mathscr{B}_y$.

(2) To show, then, that $zy$ belongs to $\partial\mathscr{E}_P^+$, take a point $z' \in zy$. If $z'$ were interior to $\mathscr{E}_P^+$, we could find a neighborhood $V(z')$ also interior to $\mathscr{E}_P^+$, and in this neighborhood (local property) a point $z^*$ such that $z^*z'$ is strictly timelike. The path $z^*z'y$ would be a t-path in $\mathscr{E}_P^+ \cap B_y$ which would not be a bicharacteristic, which we have seen is impossible. Thus $z' \in \partial\mathscr{E}_P^+$.

(3) The same argument, applied recurrently, shows that the whole path $xy$ is on $\partial\mathscr{E}_P^+$.


10 SPATIAL BOUNDARIES (global hyperbolicity 
supposed) 


Definition. The spatial boundary of a past-compact set $P$ is the subset $S$ of $P$ such that

    $x \in S \iff \mathscr{E}_x^- \cap P = \{x\}$


Theorem 1. If $S$ is the spatial boundary of $P$,

    $\mathscr{E}_S^+ = \mathscr{E}_P^+$

Corollary. $S$ is not empty.


PROOF. Let $x_0 \in \mathscr{E}_P^+$ and $\mathscr{E}_{x_0}^- \cap P \ne \{x_0\}$. There is then a point $x_1 \ne x_0$ with $x_1 \in \mathscr{E}_{x_0}^- \cap P$ and therefore a t-path $x_1 x_0 \in \mathscr{C}_P^+$. If $x_1 \notin S$, the same argument gives a point $x_2$ such that $x_2 x_1 x_0 \in \mathscr{C}_P^+$, and so on; by hypothesis $\mathscr{C}_P^+ \cap \mathscr{C}_{x_0}^-$ is compact; therefore the strictly increasing sequence of paths $x_n x_0$ has a limit point $z$, which is in $P$ (closed) and therefore on $S$.


Theorem 2. Let us denote by $\mathscr{F}\mathscr{E}_P^+$ the part of $\partial\mathscr{E}_P^+$ which does not belong to $S$ (the spatial boundary of $P$). Then $\mathscr{F}\mathscr{E}_P^+$ is generated by bicharacteristics issued from $S$ (but does not in general coincide with the set of all these bicharacteristics).


PROOF. $\mathscr{E}_P^+$ is a closed set, that is, $\partial\mathscr{E}_P^+ \subset \mathscr{E}_P^+$, which proves that any $y \in \partial\mathscr{E}_P^+$ is on a t-path $xy$, $x \in S$, since $xy$ is a bicharacteristic if $y \in \partial\mathscr{E}_P^+$.


Remark. One can prove that, at each point of $\mathscr{F}\mathscr{E}_P^+$ which is interior to a bicharacteristic, $\mathscr{F}\mathscr{E}_P^+$ admits a tangent hyperplane (the hyperplane $\xi_\alpha$ associated with the bicharacteristic vector).


Example


In Minkowski space, take two half-cones $\mathscr{E}_x^+$ and $\mathscr{E}_y^+$. The boundary $\mathscr{F}(\mathscr{E}_x^+ \cup \mathscr{E}_y^+)$ is part of the union of the boundaries (Fig. 4).


FIGURE 4 


11 CONTINUITY PROPERTY OF $\mathscr{E}_P^+$


Definition. If $A$ and $B$ are compact subsets of $V_n$ such that $A \subset B$, we call distance $d(A, B)$ the following number:

    $d(A, B) = \sup_{y \in B} \inf_{x \in A} \|x - y\|$

It has the following properties:

(1) $d(A, B) = 0 \iff A = B$
(2) $A \subset B \subset C \Rightarrow d(A, C) \ge d(B, C)$ and $d(A, C) \ge d(A, B)$
(3) $d(A, C) \le d(A, B) + d(B, C)$

An $\epsilon$-neighborhood of a set $A$ is any set $B \supset A$ which verifies

    $d(A, B) < \epsilon$

Remark: If $A$ is a closed ball of radius $r$, $B$ is contained in a closed ball of radius $r + \epsilon$.

A directed sequence of sets $B_1 \supset B_2 \supset \cdots \supset B_n \cdots$ is said to converge to a set $B$ if for every $\epsilon$ there exists $N$ such that

    $d(B, B_n) < \epsilon$ if $n > N$


Theorem.⁷ $\mathscr{E}_P^+ \cap Q$ ($Q$ compact toward the future) depends continuously on $P$ in the following sense: if $P_1 \supset P_2 \supset \cdots \supset P_n$ is a sequence of compact sets converging to $P$, then the sets $\mathscr{E}^+_{P_1} \cap Q, \ldots, \mathscr{E}^+_{P_n} \cap Q$ converge to the set $\mathscr{E}_P^+ \cap Q$.


PROOF. The sequence $\mathscr{E}^+_{P_1} \cap Q, \ldots$ is a decreasing sequence of compact sets:

    $\mathscr{E}^+_{P_1} \cap Q \supset \cdots \supset \mathscr{E}^+_{P_n} \cap Q$

therefore it tends to a limit $H$, contained in each set $\mathscr{E}^+_{P_n} \cap Q$. To prove that $H = \mathscr{E}_P^+ \cap Q$, let us take a point $u \in H$. Since $u \in \mathscr{E}^+_{P_n}$, there exists $y_n \in P_n$ such that $y_n u$ is a t-path. By the compactness hypothesis (see §7; note that $P_n \subset P_1$), there is a subsequence $\gamma_n$ of this set of t-paths which converges to a t-path $\gamma$ ending in $u$ (Fig. 5). Choose on each $\gamma_n$ a point $z_n \in P_n$, and extract from the sequence $z_n$ a subsequence which converges to a point $z$: necessarily $z \in P \cap \gamma$ since $P_n$ converges to $P$, which proves that $u \in \mathscr{E}_z^+ \subset \mathscr{E}_P^+$.


FIGURE 5


⁷ Analogous results hold when past and future are interchanged.




This theorem and the definition of the spatial boundary S will be useful 
in the construction of fundamental solutions through the following proposi- 
tion, which results immediately from the theorem and the definition of S. 


Proposition. If $x \in S$, where $S$ is the spatial boundary of a set $P$ compact toward the past, there exists a neighborhood $V(x)$ such that $\mathscr{E}^-_{V(x)} \cap \mathscr{E}_P^+$ is contained in a given ball of center $x$ and radius $\rho$ ($\rho$ arbitrary).


PROOF. Take for $Q$ and $P$ of the preceding theorem (interchanging past and future) the sets here denoted $\mathscr{E}_P^+$ and $\{x\}$, and for $P_n$ a decreasing sequence of balls of center $x$, with $V(x) \subset P_n$, $n > N$.


Remarks. (1) The preceding theorem shows that global hyperbolicity implies that the sets $\mathscr{E}^+_{V(x)} \cap \mathscr{E}^-_{V(x)}$ [where $V(x)$ is any neighborhood of $x \in V_n$] form a basis of neighborhoods for $V_n$.

(2) It also implies the "strong causality" of Penrose and Hawking since, if $\gamma = y_1 y_2 y_3 y_4$ is a t-path in $\mathscr{E}_V^+ \cap \mathscr{E}_V^-$, all points of $\gamma$ are also in $\mathscr{E}_V^+ \cap \mathscr{E}_V^-$, and, following Remark 1, an arbitrarily small neighborhood of $x$ contains such a set (Fig. 6). Under the "strong causality" hypothesis, Penrose and Hawking have proved results related to those of §9 and §10.


FIGURE 6


12 FUNDAMENTAL SOLUTIONS OF A GLOBALLY HYPERBOLIC DIFFERENTIAL OPERATOR ON $V_n$: CONSEQUENCES


In the next paragraphs we will sketch the proof of the following.


Theorem I. The hyperbolic partial differential operator $L$ (with $C^\infty$ coefficients) on $V_n$ has one and only one fundamental solution on $V_n \times V_n$, $E^+(x, x')$ [resp. $E^-(x, x')$], which is a regular $C^\infty$ kernel such that for each $x$ the support of $E^+(x, x')$ [resp. $E^-(x, x')$] is in $\mathscr{E}_x^+$ (resp. $\mathscr{E}_x^-$).

The adjoint operator $L^*$, which is also hyperbolic, then possesses two fundamental solutions, $E^{*+}(x, x')$ and $E^{*-}(x, x')$, with the same properties.


Theorem II. The partial differential equation⁸

    $Lu = v \qquad v \in \mathscr{D}'^+$    (12.1)

has one and only one solution $u \in \mathscr{D}'^+$.


PROOF [4]. (1) By an argument quite analogous to those of §3, one proves that one solution of (12.1) is

    $u(x') = \int_{V_n} E^+(x, x') v(x)$    (12.2)

by using the fact that, here,

    $\mathrm{Supp}\, v \cap \mathrm{Supp}\, \theta$ is compact    (12.3)

since $v \in \mathscr{D}'^+$ and $\mathrm{Supp}\, \theta$ is compact toward the future, because if

    $\theta(x) = \langle E^+(x, x'), \varphi(x') \rangle$    (12.4)

then

    $\mathrm{Supp}\, \theta \subset \mathscr{E}^-(\mathrm{Supp}\, \varphi)$    (12.5)

(2) An analogous argument shows that

    $\int_{V_n} E^{*-}(x', x) L_x u(x) = u(x') \qquad u \in \mathscr{D}'^+$    (12.6)

from which uniqueness follows.


Corollary. (a) The uniqueness of the fundamental solution $E^+(x, x')$ [resp. $E^-(x, x')$] follows from the general uniqueness theorem, since it verifies

    $L_{x'} E^+(x, x') = \delta(x, x')$    (12.7)

and belongs to $\mathscr{D}'^+$ for each $x$.

(b) Comparing (12.2) and (12.6), we get

    $E^+(x, x') = E^{*-}(x', x)$

One also deduces easily from it the invariance by isometry of $E^+(x, x')$.


Remark: If v is a C® function, so is uw. 


Propagator. The difference E(x, x’) - E~(x, x’) has been called by 
Lichnerowicz “ propagator.” It plays a fundamental role in the quantum 


8 Q’+ is the space of distributions with support compact toward the past. 


Hyperbolic Partial Differential Equations 103 


theory of fields on a curved space-time, as it has been developed independently
by Lichnerowicz [2] and B. and C. DeWitt [11].

The propagator gives, by Volterra convolution, the general solution of 
the homogeneous equation (cf. [5a]). 


13 CAUCHY PROBLEM 


We look for a solution of

$$Lu = v, \qquad v \in C^\infty(V_n)$$

such that, on a given Lipschitzian hypersurface $S$, compact toward the past,
$u - w$ vanishes $m$ times, $w$ being a function given in a neighborhood of $S$ and
satisfying

$$Lw = v \quad \text{on } S$$

The solution, in the sense of distributions, is given by

$$u(x') = \int_{V_n} E^{+}(x, x')\,\bigl(T(x) + v(x)\bigr)$$

where $T(x)$ is the following distribution
LT(x) = {Lw} — [Lw] 
$\{W\}$ being the distribution defined by the function equal to $W$ on $\mathcal{E}_S^{+}$ and
to zero outside; $T(x)$ is a distribution with support $S$.
One shows easily that, if $S$ is spatial at each point, the solution $u$ is a
solution of the Cauchy problem in the usual sense. Results can also be obtained
when $S$ has characteristic points, or is a characteristic manifold.


14 LOCAL EXISTENCE THEOREM FOR THE 
FUNDAMENTAL SOLUTION 


The general existence theorem for the solution of the Cauchy problem,
founded on a priori estimates based on the “energy” inequality and essentially
due to Leray [1], proves the existence of a fundamental solution of hyperbolic
linear systems of any order on a manifold. In the case of second-order
equations, there are several methods of constructing these fundamental
solutions explicitly.⁹ The result is as follows:


Theorem. If $L$ is a hyperbolic operator on $V_n$ (with $C^\infty$ coefficients), every
point $x \in V_n$ has an open neighborhood $K$ such that $L$ has a fundamental
solution in&@, Aé,x” 


⁹ For an even number of dimensions a convenient one, which originates from Sobolev,
uses the fact that the wave operator in flat space has a fundamental solution with support
on the characteristic cone (cf. [5]).




Global Existence Theorem for the Fundamental Solutions 


1. Extension of a local fundamental solution by the properties of support:

Suppose we have constructed a fundamental solution $E^{-}(x, x')$ in the open set
$\mathcal{E}_Q^{-} \cap \mathcal{E}_P^{+}$ ($P$ and $Q$ compact subsets of $V_n$) [analogous reasoning
would hold for $E^{+}(x, x')$].

We define the extension of $E^{-}(x, x')$, which we will still denote by
$E^{-}(x, x')$, as a distribution on $V_n \times V_n$ which will be a fundamental solution in
$D \times D'$ with

D=653 UC&p* D=6s5* 


where D means open set arbitrarily near D, with closure belonging to D 


FIGURE 7 


(Fig. 7). In setting, if $\varphi, \psi \in C_0^\infty(V_n)$


E™ (x, x’), p(x)W(x’) = E(x, x’), P(x) W(x’), 
9 = 90,¥ =o 
where @ denotes a given C® function such that 
o=l1onés n& *,¢=00nG(6g Op") 


we have

$$L_{x'} E^{-}(x, x') = \delta(x, x') \quad \text{on } D \times D'$$

and $E^{-}(x, x')$ has, for every $x$, a future compact support (in $\mathcal{E}_x^{-}$).
2. Construction of a fundamental solution on an open set not contained
in $D \times D'$:




Suppose that D is not identical to V,. The set CD =, is a closed 
subset of ¢p*. It is therefore past-compact and has a spatial boundary S. 
Take a point xe S; there exist compact neighborhoods (with nonempty 
interiors) Q, of x such that the local fundamental solution exists in 
bo, A bp, (cf. §14), where P = C(C&p* U &g_), and (65, A &3,) does not 
belong to D (since x was a point on its boundary) (Fig. 8). 


FIGURE 8 


The extension construction 1 gives a distribution
$E_1^{-}(x, x')$ on $V_n \times V_n$
which is a fundamental solution in $D_1 \times D_1'$.
=&~ OCé, D,' =, 
L, E,~(x, x’) = 6(x, x’) in B, x B,’ 

Remark. We have D, 4 D, D’ — D,' = D 

3. Piecing together of the two fundamental solutions: 

We will construct a distribution E E~ (x, x’) on V, x V,, which will be a 


fundamental solution in (D U D,) x D’. 
Define E ~(x, x’) by the Volterra convolution: 


E-(x, x’) = [ E~(x", x’)k(x”, x) 


where we set on V, x JV, 


K(k", k) =F {d(k", k) =< Ly E, ~(x, xf 




where $f$ is a $C^\infty$ function such that
f=0on C(D UD,) 


f=l1onD uD, 


We note that by 
6(x”, x) — L, E,~ (x, x”) =0on D, x D,’ 


and get, since CD,” U D, < D 

L,. E(x, x’) = k(x’, x) 
The distribution on V, x V,, 

E, 7 (x, x’) + E~(x, x’) 


is a fundamental solution in DOD x BD’ with future compact support 
for all x. 

4. The preceding construction applied recurrently gives a fundamental
solution in a strictly increasing sequence of sets, which enables one to prove
the existence of a fundamental solution on $V_n \times \mathcal{E}_P^{+}$, with the required
property of support, when $P$ is an arbitrary past-compact set.


REFERENCES 


1. J. Leray, Hyperbolic Partial Differential Equations, mimeographed notes,
Princeton, 1952.

2. A. Lichnerowicz, Propagateurs et commutateurs en relativité générale,
Publications I.H.E.S., No. 10, 1961.

3. A. Marchaud, Compositio Mathematica, Vol. 3, 1936, pp. 89-127.

4. Y. Bruhat, Equations ultrahyperboliques à coefficients variables, Colloque
Equations aux dérivées partielles, C.N.R.S., 1956.

5. Y. Bruhat, in: Gravitation, an introduction to current research, edited by
L. Witten; J. Wiley, 1962.

5a. Y. Bruhat, “Sur la théorie des propagateurs,” Annali di matematica LXIV
(1964), series IV, pp. 191-228.

6. S. Kobayashi and K. Nomizu, Foundations of Differential Geometry,
Interscience.

7. R. Bott, in this volume, Chapter XVIII.

8. A. Lichnerowicz, Topics on Space-Times, in this volume, Chapter V.

9. R. Penrose, An Analysis of the Structure of Space-Time, mimeographed
notes, Princeton, 1966.

10. S. Hawking, “The Occurrence of Singularities in Cosmology” I, II, III,
Proc. Roy. Soc. A294, 577 (1966); A295, 490 (1966).

11. B. DeWitt, “Quantum Theory,” in Relativity, Groups and Topology, Les
Houches, 1964, edited by B. DeWitt and C. DeWitt, Gordon and Breach.


V 


Topics on Space-Times


ANDRE LICHNEROWICZ 


I Space-Time and Orientations
1 Space-Time
2 Lorentz Group and Orientations
3 Results Concerning the Existence of Lorentz Metrics and Time-Orientations
II Differentiable Structure and Field Equations
4 Cauchy Problem for the Einstein Equations
III Completeness and Global Hyperbolic Manifolds
5 The Mapping $\mu$ for the Arcs
6 Completeness of a Space-Time
7 Globally Hyperbolic Manifold
8 Homotopy
IV Spin-Structure
9 Notion of Spin-Structure
10 Existence of a Spin-Structure
References


In this lecture, my primary purpose is to promote fruitful discussions 
between mathematicians and physicists who have an interest in relativity. 
I will give a survey of notions and questions rather than particular results—and 
my hope is that you will contribute something to the difficult construction of a 
good axiomatic approach to general relativity. 


I SPACE-TIME AND ORIENTATIONS 


1 Space-Time 


(a) We consider here only paracompact differentiable manifolds. A 
manifold is called paracompact if any open covering of the manifold admits a 
refinement which is a locally finite open covering of the manifold. 






A space-time is a connected (paracompact) differentiable manifold $V_4$
of dimension 4, with a Lorentz “metric.” A Lorentz metric $ds^2$ is defined by
a differentiable, symmetric, covariant 2-tensor $g$ of signature $+\,-\,-\,-$. If
$T_x$ is the vector space tangent in $x$ to $V_4$, the Lorentz metric defines, by the
equation $ds^2 = 0$, a convex cone $C_x$ in $T_x$. We have thus on $V_4$ a field of cones.
A direction is timelike if $ds^2 > 0$, spacelike if $ds^2 < 0$. The affine space $T_x$
admits a structure of Minkowski space.

For the moment, we assume that the differentiable structure of $V_4$ is
($C^2$, $C^4$ by pieces): in the intersection of the domains of two systems of local
coordinates, the coordinates of a point $x$ in the first system are functions of
class ($C^2$, $C^4$ by pieces), with a nonvanishing Jacobian, of the coordinates of
the same point $x$ in the second system. We can assume now that the metric $g$
is ($C^1$, $C^3$ by pieces): in admissible coordinates

$$ds^2 = g_{\alpha\beta}\, dx^\alpha dx^\beta \qquad (\alpha, \beta = 0, 1, 2, 3)$$

where the components $g_{\alpha\beta}$ are functions of class ($C^1$, $C^3$ by pieces).
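(b) The main element for the physics is the principal fiber bundle $E(V_4)$
defined by the set of the orthonormalized frames $y = \{l_0, l_A\}$ ($A = 1, 2, 3$), with

$$l_\alpha \cdot l_\beta = \eta_{\alpha\beta}, \qquad \eta_{\alpha\beta} = 0 \ \text{for } \alpha \neq \beta, \qquad \eta_{00} = 1, \qquad \eta_{AA} = -1$$

The structural group of $E(V_4)$ is the complete Lorentz group $L(4)$. With
respect to the frames $y \in E(V_4)$, the metric can be written on an open
neighborhood of $V_4$:

$$ds^2 = g_{\alpha\beta}\, \theta^\alpha \theta^\beta = (\theta^0)^2 - \sum_A (\theta^A)^2$$

where the $\theta^\alpha$ are pfaffian forms and where $g_{\alpha\beta} = \eta_{\alpha\beta}$.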


2 Lorentz Group and Orientations 


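(a) It is well known that the Lorentz group has four connected components.
If $A = (A^{\alpha'}{}_{\beta}) \in L(4)$, we will study which component $A$ belongs to.
We call the total signature $\varepsilon_A$ of $A$ a number equal to $\pm 1$, according to the
sign of $\det A$. The time-signature $\rho_A$ of $A$ is $\pm 1$ according to the sign of
$A^{0'}{}_{0}$. The space-signature $\sigma_A$ of $A$ is the product $\varepsilon_A \rho_A$. We thus obtain
the four components:

$$\begin{array}{ll}
L_0(4):\ \varepsilon_A = 1,\ \rho_A = 1 & L_1(4):\ \varepsilon_A = -1,\ \rho_A = 1 \\
L_2(4):\ \varepsilon_A = -1,\ \rho_A = -1 & L_3(4):\ \varepsilon_A = 1,\ \rho_A = -1
\end{array}$$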
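
As an illustration of the two signatures, the following minimal Python sketch classifies a given Lorentz matrix by the sign of its determinant and the sign of its time-time entry. The function names (`det`, `signatures`, `diag`, `boost_x`) and the sample matrices are illustrative assumptions, not part of the lecture; index 0 is taken to be the time index and units with c = 1 are assumed.

```python
import math

def det(m):
    """Determinant by cofactor expansion along the first row (adequate for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def signatures(A):
    """Return (eps, rho, sigma): sign of det A, sign of A[0][0], and their product."""
    eps = 1 if det(A) > 0 else -1
    rho = 1 if A[0][0] > 0 else -1
    return eps, rho, eps * rho

def diag(*d):
    """Diagonal matrix as a list of lists."""
    n = len(d)
    return [[d[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def boost_x(v):
    """Lorentz boost along the first space axis with velocity v, |v| < 1."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, g * v, 0.0, 0.0],
            [g * v, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

# One representative of each of the four components.
space_inversion = diag(1.0, -1.0, -1.0, -1.0)
time_reversal = diag(-1.0, 1.0, 1.0, 1.0)
total_inversion = diag(-1.0, -1.0, -1.0, -1.0)

for name, A in [("boost", boost_x(0.6)),
                ("space inversion", space_inversion),
                ("time reversal", time_reversal),
                ("total inversion", total_inversion)]:
    print(name, signatures(A))
```

The four sample matrices give four distinct pairs of total and time signatures, one for each connected component; a boost lies in the component of the identity.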
(b) It is now necessary to study the number of connected components of
$E(V_4)$. This question is strictly connected with the existence of the following
geometrical objects:




An orientation, or total orientation, of $V_4$ is a pseudoscalar $\varepsilon$ of
square 1. If the manifold $V_4$ is orientable, that is, if such a geometrical object
exists, it is defined by one component $\varepsilon_y = \pm 1$ such that, if $y = y'A$ ($A \in L(4)$):

$$\varepsilon_y = \varepsilon_{y'}\, \varepsilon_A$$

It is clear that $V_4$ has at most two total orientations, $\varepsilon$ and $-\varepsilon$.
We choose an orientation $\varepsilon$. Let us consider a covering of $V_4$ by
neighborhoods provided with orthonormalized frames. The set of the local forms

$$\varepsilon\, \theta^0 \wedge \theta^1 \wedge \theta^2 \wedge \theta^3$$

defines on $V_4$ a global 4-form, the volume element of $V_4$ for the considered
orientation.

(c) A time-orientation $\rho$ is defined on $V_4$, with respect to the frames
$y \in E(V_4)$, by one component $\rho_y = \pm 1$ such that if $y = y'A$, we have

$$\rho_y = \rho_{y'}\, \rho_A$$

$V_4$ has at most two time-orientations, $\rho$ and $-\rho$.

We choose a time-orientation $\rho$. A vector $l_0$ ($l_0^2 = 1$) in $x$ is oriented
toward the future (resp. the past) if the component of $\rho$ with respect to the
orthonormalized frames $(l_0, l_A)$ is 1 (resp. $-1$). The time-orientation $\rho$
allows one to distinguish on $V_4$, in a continuous way, the half-cones of $C_x$:
future half-cone $C_x^{+}$ and past half-cone $C_x^{-}$. If a manifold is time-orientable,
we have two fields of half-cones. The data $\varepsilon$ and $\rho$ define on $V_4$ a spatial
orientation $\sigma = \varepsilon\rho$. These various orientations play a part in the definition
of $P$ and $T$.


3 Results Concerning the Existence of Lorentz Metrics and 
Time-Orientations 


Now, I will recall the main results concerning the situation. 

(a) Every (paracompact) differentiable manifold admits a Riemannian 
metric. 

(b) If we consider the timelike eigen-direction of a Lorentz metric, with
respect to a Riemannian metric, we define a direction field on the manifold.
Conversely, if a manifold admits a direction field, it is possible to construct a
Lorentz metric such that these directions are timelike with respect to the
metric. We obtain:

(i) A differentiable manifold admits a Lorentz metric if and only if it
admits a direction field, and these directions can be chosen timelike.

(ii) Each space-time admits systems of time-lines. If the space-time is
time-orientable, it admits systems of oriented time-lines.




(c) Every noncompact manifold admits oriented direction fields and thus
Lorentz metrics with time-orientations.

(d) A compact manifold admits Lorentz metrics if and only if the
Euler-Poincaré characteristic $\chi$ is zero. If such is the case, there exist oriented
direction fields and thus Lorentz metrics with time-orientations [6].

(e) But conversely it is known that a given orientable space-time $V_4$ can
be non-time-orientable [2].

We are thus led to ask some questions: 

(1) Must a space-time be orientable, or time-orientable, or both? It 
appears perhaps that the existence of a time-orientation is physically necessary. 
I assume this existence in the following section. 

(2) We have seen that every space-time admits global systems of
time-lines. A spacelike section of $V_4$ is a regular spacelike hypersurface $\Sigma$ such that
every time-line of a global system of time-lines cuts $\Sigma$ in one point and only
one. But, in general, a space-time does not admit space-like sections. Is
their existence physically necessary?

(3) For some theorems, the existence of a fibration of $V_4$ by a system
of time-lines is a convenient assumption. However, in many cases, physicists
and cosmologists assume, explicitly or implicitly, that $V_4$ is a product $V_3 \times R$.
Is not this assumption generally too strong?


II DIFFERENTIABLE STRUCTURE AND FIELD EQUATIONS 


4 Cauchy Problem for the Einstein Equations 
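(a) Exactly as in the Riemannian case, the Lorentz metric of $V_4$ defines
a linear connection $\omega$, without torsion, such that

$$\nabla g = 0$$

where $\nabla$ is the operator of covariant derivation in this connection $\omega$.
Let $R_{\alpha\beta}$ be the Ricci tensor of the Lorentz metric and $S_{\alpha\beta}$ the Einstein tensor
of $V_4$, defined by

$$S_{\alpha\beta} = R_{\alpha\beta} - \tfrac{1}{2}\, g_{\alpha\beta} R$$

which satisfies the conservation identities

$$\nabla_\alpha S^{\alpha}{}_{\beta} = 0 \tag{4.1}$$

Equation (4.1) follows from the Bianchi identities for the curvature tensor.
In general relativity, $ds^2$ satisfies the Einstein equations:

$$S_{\alpha\beta} = \chi T_{\alpha\beta} \tag{4.2}$$

where $T_{\alpha\beta}$ is the energy tensor corresponding to some energy distribution. In
a domain which is empty, $T_{\alpha\beta} = 0$ and we have the so-called exterior system

$$S_{\alpha\beta} = 0 \quad \text{or} \quad R_{\alpha\beta} = 0$$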


(b) In an open neighborhood $\Omega$ of $V_4$, let us consider a local space-like
hypersurface $\Sigma$ defined by the equation $\varphi = 0$ (with $\varphi$ of class $C^2$, $C^4$ by
pieces). We adopt local coordinates such that $x^0 = \varphi$ in $\Omega$; then we have
$g^{00} > 0$.
We consider, for simplicity, the exterior Einstein system. Cauchy data
on $\Sigma$ are the “potentials” $g_{\alpha\beta}$ and their first derivatives $\partial_0 g_{\alpha\beta}$. The Einstein
system is equivalent to the set of the two systems:


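$$R_{AB} \equiv -\tfrac{1}{2}\, g^{00}\, \partial_{00}\, g_{AB} + \Phi_{AB} = 0 \qquad (A, B = 1, 2, 3) \tag{4.3}$$

$$S_\alpha{}^{0} \equiv \Psi_\alpha{}^{0} = 0 \tag{4.4}$$

where, on $\Sigma$, the $\Phi$'s and $\Psi$'s depend only on the Cauchy data and their first
derivatives along $\Sigma$. According to (4.1), every solution of (4.3) satisfying
(4.4) on $\Sigma$ satisfies this system everywhere.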

(c) $g^{00}$ being $\neq 0$, the system (4.3) gives the values on $\Sigma$ of the second
derivatives $\partial_{00}\, g_{AB}$ and these derivatives are continuous across $\Sigma$. In the
system (4.3), (4.4) the second derivatives $\partial_{00}\, g_{\alpha 0}$ are absent (“nonsignificative”
derivatives).

We can analyze this phenomenon: it is possible to change in $\Omega$ the local
coordinates, according to the formulas

$$x^{\alpha'} = x^{\alpha} + \frac{(x^0)^3}{6}\, \bigl\{\psi^{\alpha}(x^A) + \varepsilon^{\alpha}(x^{\beta})\bigr\}$$

where $\varepsilon^{\alpha} \to 0$ when $x^0 \to 0$. The coordinates of a point $x$ of $\Sigma$ and the values of the
Cauchy data in $x$ are invariant under this change. The same is true of the
values of the “significative” derivatives $\partial_{00}\, g_{AB}$. But it is possible, by the
choice of the $\psi^{\alpha}$, to give arbitrary values to the “nonsignificative” derivatives
$\partial_{00}\, g_{\alpha 0}$. Our differentiable structure is compatible with the choice of two
different systems of $\psi^{\alpha}$ for both sides of $\Sigma$; we can thus create or destroy
discontinuities of the nonsignificative derivatives, discontinuities which are then
physically meaningless. Thus, the assumed differentiable structure is strictly
connected with the covariance of the field equations.

(d) It is possible to obtain discontinuities of the significative derivatives
(or, equivalently, of the curvature tensor) if and only if $g^{00} = 0$. We see that
the characteristics of the Einstein equations (in another terminology, the
gravitational waves) are the hypersurfaces tangent at each point $x$ to the
main cone $C_x$.
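
The condition on $g^{00}$ can be restated without reference to the adapted coordinates. A short check, assuming only the coordinate choice $x^0 = \varphi$ made in (b):

```latex
% With x^0 = \varphi one has \partial_\alpha \varphi = \delta^0_\alpha, so
\[
  g^{\alpha\beta}\,\partial_\alpha\varphi\,\partial_\beta\varphi
  = g^{\alpha\beta}\,\delta^0_\alpha\,\delta^0_\beta
  = g^{00}.
\]
% Hence g^{00} = 0 on \Sigma says that the normal covector d\varphi is
% isotropic, i.e. \Sigma is tangent to the cone C_x at each of its points,
% while g^{00} > 0 says that \Sigma is space-like.
```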

It is well known that we can determine precisely the differentiable
structure of $V_4$ in a $C^h$-structure, where $h$ is sufficiently large, and it is not
absurd to consider Lorentz metrics which are ($C^1$, $C^{h-1}$ by pieces). But it
is necessary that we know that this precision is physically meaningless. The
introduction of new changes of coordinates which are ($C^2$, $C^4$ by pieces)
preserves all the physical properties (discontinuities of the curvature tensor
for instance) of the space-time.

(e) If we consider gravitational shock waves, that is to say, cases where
$g$ is only ($C^0$, $C^1$ by pieces), Mrs. Choquet-Bruhat and Taub have proved that
the situation is similar to the one which concerns ordinary waves: it is remarkable
that the significative derivatives $\partial_0\, g_{AB}$ can be discontinuous only across
the characteristics $\Sigma$, these discontinuities satisfying four convenient conditions
of continuity.


III COMPLETENESS AND GLOBAL HYPERBOLIC 
MANIFOLDS 


5 The Mapping $\mu$ for the Arcs


(a) Let $V_4$ be a space-time; it admits the canonical connection $\omega$. If
$\gamma: t \in [0, 1] \to x(t) \in V_4$ is a differentiable oriented arc issued from a point $a$ of
$V_4$, we will define a mapping $\mu$ of $\gamma$ into the Minkowski space $T_a$ tangent in
$a$ to $V_4$ (development).

Let us consider an arc $\Gamma: t \in [0, 1] \to y(t)$ of the fiber bundle $E(V_4)$ such
that $p\Gamma = \gamma$. We denote by $(\bar{x}, \bar{y})$ a frame of $T_a$, where $\bar{x}$ is the origin of the frame
and where $\bar{y}$ is a base $(\bar{e}_\alpha)$ of $T_a$ considered as a matrix. We introduce the
one-parameter family of frames of $T_a$ satisfying the differential system


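$$\frac{d\bar{x}}{dt} = \bar{y}\,\theta\!\left(\frac{dy}{dt}\right), \qquad \frac{d\bar{y}}{dt} = \bar{y}\,\omega\!\left(\frac{dy}{dt}\right) \tag{5.1}$$

with the initial conditions $\bar{x}(0) = a$, $\bar{y}(0) = y(0)$. Here $\theta = (\theta^\alpha)$ and $\omega = (\omega^\alpha{}_\beta)$.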

It is easy to see that the arc $\bar{\gamma}$ of $T_a$ defined by $t \in [0, 1] \to \bar{x}(t) \in T_a$ depends
only on the given arc $\gamma$ of $V_4$ and not on the arc $\Gamma$ of the fiber bundle.

(b) By definition, the arc $\bar{\gamma}$ is the image of the arc $\gamma$ by the development $\mu$.
The arc $\gamma$ is geodesic if and only if $\bar{\gamma} = \mu(\gamma)$ is a line segment.


6 Completeness of a Space-Time 


(a) An oriented $\mathcal{C}^1$ arc $\gamma$ of a space-time is called weakly timelike if the
positive half-tangent to $\gamma$ at each point $x$ is timelike or isotropic, that is, lies in or
on $C_x^{+}$. If we compose such arcs, we obtain weakly timelike piecewise $\mathcal{C}^1$
arcs (p.w. $\mathcal{C}^1$ arcs).

(b) A space-time $V_4$ is called complete if for all $a \in V_4$ and for every oriented
arc $\bar{\gamma}$ of the Minkowski space tangent in $a$, there exists an arc $\gamma$ issued from $a$
such that $\mu(\gamma) = \bar{\gamma}$. The space-time is called t-complete if the property is true
only for the arcs $\bar{\gamma}$ which are weakly timelike.

A space-time $V_4$ is called geodesically complete if for all $a \in V_4$ and for
every oriented segment $\bar{\gamma}$ of the Minkowski space tangent in $a$, there exists a
geodesic arc $\gamma$ issued from $a$ such that $\mu(\gamma) = \bar{\gamma}$. It is equivalent to say that
each geodesic arc of $V_4$ can be extended to arbitrary values of the affine
parameter. The space-time is called geodesically t-complete if the property
is true only for the oriented segments $\bar{\gamma}$ which are timelike or isotropic.
We have trivially:


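$$\begin{array}{ccc}
\text{complete} & \Longrightarrow & \text{geod. complete} \\
\Downarrow & & \Downarrow \\
\text{t-complete} & \Longrightarrow & \text{geod. t-complete}
\end{array}$$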


I do not know nontrivial connections among these four notions. 


7 Globally Hyperbolic Manifold 


(a) We introduce now the closure $\mathcal{T}$ of the set of the weakly timelike
p.w. $\mathcal{C}^1$ arcs, for the compact-open topology on the space of the continuous
arcs of $V_4$. Each element $\gamma$ of $\mathcal{T}$ is now called a weakly timelike arc. Such
an arc is rectifiable for an arbitrary Riemannian metric of $V_4$.

If $K$ is a set of $V_4$, let $A^{+}(K)$ be the set of the weakly timelike arcs
starting at the points $x$ of $K$ in the future of $x$. The future $E^{+}(K)$ is the set of
the points of $V_4$ which correspond to the arcs of $A^{+}(K)$. We have similar
definitions for $A^{-}(K)$ and for the past $E^{-}(K)$ of $K$. The emission $E(K)$ of $K$
is the union of the future $E^{+}(K)$ and of the past $E^{-}(K)$.

We denote by $\partial E^{+}(K)$, $\partial E^{-}(K)$, $\partial E(K)$ the boundaries of $E^{+}(K)$, $E^{-}(K)$,
$E(K)$. It follows from results of Leray and Mrs. Choquet-Bruhat that if
$x \in \partial E^{+}(K)$ and $y \in E^{+}(K)$, each weakly timelike arc $\gamma_0$ joining $x$ and $y$ is a
null-geodesic arc and that all the points of $\gamma_0$ are contained in $\partial E^{+}(K)$.

These definitions and results are valid for $K = \{z\}$; $\partial E(z) = \Gamma_z$ is, by
definition, the characteristic conoid corresponding to $z \in V_4$. In the
neighborhood of $z$, it is generated by null-geodesic arcs issued from $z$. But $\Gamma_z$ is
globally distinct from the set of the points of $V_4$ defined by the null-geodesics
issued from $z$.

(b) According to Leray and Mrs. Choquet-Bruhat, a space-time $V_4$ is
called globally hyperbolic if for all $x, y \in V_4$, the set of arcs $A^{+}(x) \cap A^{-}(y)$
is compact in $\mathcal{T}$. If this condition is satisfied, the set of points of $V_4$ defined
by $E^{+}(x) \cap E^{-}(y)$ is compact in $V_4$. This condition implies that a timelike
arc cannot be closed. We assume now the global hyperbolicity.

A set $K$ is called compact toward the past if, for all $x \in V_4$, the
set $A^{+}(K) \cap A^{-}(x)$ is compact in $\mathcal{T}$; $E^{+}(K)$ and all closed subsets of $E^{+}(K)$
are then also compact toward the past. The set of points of $V_4$ defined by
$E^{+}(K) \cap E^{-}(x)$ is compact in $V_4$.

Leray and Mrs. Choquet-Bruhat have proved the following important
lemma: if $K$ is compact toward the past and if $K'$ is compact, $A^{+}(K) \cap A^{-}(K')$
is compact in $\mathcal{T}$.




(c) The interest of the assumption of global hyperbolicity is to ensure
the existence of elementary solutions for the corresponding linear hyperbolic
differential operators. If $T$ is a $p$-tensor of $V_4$, we set

$$(\Delta T)_{\alpha_1 \cdots \alpha_p} = -\nabla^{\rho} \nabla_{\rho}\, T_{\alpha_1 \cdots \alpha_p}$$

Let $C$ be a field of linear operators on the $p$-tensors and $B$ a field of linear
mappings of the $(p+1)$-tensors into the $p$-tensors. Let us consider, for
instance, the linear hyperbolic differential operator

$$LT = \Delta T + B^{\rho} \nabla_{\rho} T + CT$$

Under the assumption of global hyperbolicity, there exist for $L$ two elementary
kernels $E^{(p)\pm}(x, x')$, that is, two bi-$p$-tensor distributions satisfying

$$L_x^{*}\, E^{(p)\pm}(x, x') = \delta^{(p)}(x, x')$$

and which have, for each $x'$, their supports, respectively, in $E^{+}(x')$ and $E^{-}(x')$;
$L^{*}$ is here the adjoint operator and $\delta^{(p)}$ the Dirac bi-$p$-tensor defined by

$$\langle \delta^{(p)}(x, x'), T(x') \rangle = T(x)$$


8 Homotopy 


(a) If $\gamma$ is a weakly timelike arc joining $x$ and $y$, it defines a class of
homotopy $C(x, y)$, the set of the weakly timelike arcs joining $x$ and $y$ which
are homotopic to $\gamma$; $C(x, y)$ is a closed subset of $A^{+}(x) \cap A^{-}(y)$. For given
$x$ and $y$, the set of the classes $C(x, y)$ is enumerable.

For a globally hyperbolic manifold, each class $C(x, y)$ is compact and,
for given $x$, $y$, the number of the classes $C(x, y)$ is finite.

(b) It is perhaps interesting to study hyperbolic manifolds $V_4$ satisfying
the following assumption: $V_4$ is called weakly globally hyperbolic if each class
$C(x, y)$ of homotopy defined by a weakly timelike arc is compact in $\mathcal{T}$.

Let $\gamma_0$ be a sufficiently small strictly timelike geodesic arc joining $x$ and $y$.
It is easy to see that the length of $\gamma_0$ is always greater than or equal to the length
of an arbitrary weakly timelike arc joining $x$ and $y$. Moreover, the length is
an upper semicontinuous function on $\mathcal{T}$.

It follows that, if $V_4$ is weakly globally hyperbolic, each class of homotopy
$C(x, y)$ contains a geodesic arc which gives the maximum of the lengths of the
arcs of the class. In particular, two points of $V_4$ joined by a timelike arc can
be joined by a timelike geodesic arc.
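
The maximizing property of timelike geodesics is easy to check numerically in the simplest case. The sketch below is only an illustration under stated assumptions: it works in flat two-dimensional Minkowski space with signature (+, -), where geodesics are straight lines, and the function name `minkowski_length` and the sample arcs are hypothetical.

```python
import math

def minkowski_length(points):
    """Length of a piecewise-linear arc in 2D Minkowski space, signature (+, -).

    `points` is a list of (t, x) vertices; each segment must be future-directed
    and timelike or isotropic, i.e. dt >= |dx| (a weakly timelike arc).
    """
    total = 0.0
    for (t0, x0), (t1, x1) in zip(points, points[1:]):
        dt, dx = t1 - t0, x1 - x0
        if dt < abs(dx):
            raise ValueError("segment is not weakly timelike")
        total += math.sqrt(dt * dt - dx * dx)
    return total

# Three weakly timelike arcs joining the same two points (0, 0) and (2, 0).
geodesic = [(0.0, 0.0), (2.0, 0.0)]                  # straight timelike segment
detour = [(0.0, 0.0), (1.0, 0.6), (2.0, 0.0)]        # broken timelike arc
null_zigzag = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]   # two isotropic segments

for name, arc in [("geodesic", geodesic), ("detour", detour),
                  ("null zigzag", null_zigzag)]:
    print(name, minkowski_length(arc))
```

Among the three arcs the straight segment is the longest, and the isotropic zigzag has zero length, in line with the statement that the geodesic gives the maximum of the lengths.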


IV SPIN-STRUCTURE 


9 Notion of Spin-Structure 


(a) Let $V_4$ be a space-time admitting a total orientation $\varepsilon$ and a
time-orientation $\rho$. We denote by $E_0(V_4)$ the principal fiber bundle on $V_4$ defined by
the orthonormalized frames $y$ such that $\varepsilon_y = 1$, $\rho_y = 1$. The structural group
of $E_0(V_4)$ is the connected Lorentz group $L_0(4)$. If $y = \{l_\alpha\}$, $y' = \{l_{\alpha'}\}$ are
two elements of $E_0(V_4)$, we have $l_\alpha = l_{\alpha'} A^{\alpha'}{}_{\alpha}$ or, in terms of matrices,

$$y = y'A$$

where $A \in L_0(4)$.

(b) Let $(\mathrm{Spin}_0(4), p)$ be the universal covering group of $L_0(4)$. The
kernel $N$ of $p: \mathrm{Spin}_0(4) \to L_0(4)$ is isomorphic to $Z_2$.

We say that $V_4$ admits a spin-structure if there exists on $V_4$ a principal
fiber bundle $S_0(V_4)$, with $\mathrm{Spin}_0(4)$ as structural group, which is a 2-covering of
$E_0(V_4)$, such that the restriction of the projection to the fibers of $S_0(V_4)$ is a
2-covering of the fibers of $E_0(V_4)$, the following diagram being commutative:

$$\begin{array}{ccc}
\mathrm{Spin}_0(4) & \longrightarrow & S_0(V_4) \\
\downarrow & & \downarrow \\
L_0(4) & \longrightarrow & E_0(V_4)
\end{array}$$

We will say that $S_0(V_4)$ is deduced from $E_0(V_4)$ by extension of the structural
group.


10 Existence of a Spin-Structure 


(a) In the case of a Riemannian manifold, we have a similar definition
of the notion of spin-structure. Haefliger has proved that an orientable
compact Riemannian manifold admits a spin-structure if and only if the
second Whitney characteristic class of the manifold is null.

I will sketch a simple proof of the theorem of Haefliger in the present
context. Let us consider the exact sequence of groups

$$0 \longrightarrow Z_2 \xrightarrow{\ i\ } \mathrm{Spin}_0(4) \xrightarrow{\ p\ } L_0(4) \longrightarrow 0$$

The sequence is exact, that is to say, the kernel of each mapping coincides
with the image by the previous mapping: here the inclusion map $i$ of $Z_2$ in
$\mathrm{Spin}_0(4)$ is an isomorphism onto $N$; $p$ admits the image $N$ as kernel and
$L_0(4)$ is the image of $\mathrm{Spin}_0(4)$ by $p$.

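It is well known that such a sequence induces an exact sequence of
sheaves of local cross sections. With usual notations, we have

$$0 \longrightarrow Z_2 \longrightarrow \underline{\mathrm{Spin}_0(4)} \longrightarrow \underline{L_0(4)} \longrightarrow 0$$

From this follows the exact cohomology sequence

$$\cdots \longrightarrow H^1(V_4, \mathrm{Spin}_0(4)) \xrightarrow{\ p^{*}\ } H^1(V_4, L_0(4)) \xrightarrow{\ \delta^{*}\ } H^2(V_4, Z_2) \longrightarrow \cdots$$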


where $\delta^{*}$ corresponds to the coboundary operator; $H^1(V_4, \mathrm{Spin}_0(4))$ is
isomorphic to the set of the equivalence classes of the principal fiber bundles
on $V_4$ with $\mathrm{Spin}_0(4)$ as the structural group. Likewise, $H^1(V_4, L_0(4))$ is
isomorphic to the set of the equivalence classes of the principal fiber bundles
on $V_4$ with $L_0(4)$ as structural group.

Let $E \in H^1(V_4, L_0(4))$ be the element corresponding to the given fiber
bundle $E_0(V_4)$. There exists a corresponding spin-structure if and only if
there exists an element $S \in H^1(V_4, \mathrm{Spin}_0(4))$ such that

$$E = p^{*} S$$

The sequence being exact, we see that this is the case if and only if $E$ belongs
to the kernel of $\delta^{*}$:

$$\delta^{*} E = 0$$


(b) In the case of an orientable compact Riemannian manifold $V_n$, we
substitute $SO(n)$ for $L_0(4)$ and $\delta^{*}E$ defines the second Whitney class $C_2(V_n)$.
We thus obtain the result of Haefliger.

In the case of a compact space-time with a total orientation and a
time-orientation, $\delta^{*}E$ still defines a characteristic class which can be
identified with the second Whitney class.

In the case of a noncompact space-time the situation is the same, except
for the supports. More complicated results are obtained in the nonorientable
cases (Bott).


REFERENCES 


1. A. Avez, Ann. Inst. Fourier (1963), pp. 105-190.

2. Y. Choquet-Bruhat, Séminaire de Physique mathématique, Collège de
France, mimeographed notes, 1963.

3. J. Leray, Hyperbolic Differential Systems, mimeographed notes, Institute of
Advanced Study, Princeton.

4. A. Lichnerowicz, Propagateurs et commutateurs en relativité générale,
Publ. Math. Inst. Hautes Etudes No. 10, Paris, 1961. Théories relativistes de la
gravitation, Masson, Paris, 1954.

5. L. Markus, Ann. Math. 62, pp. 411-417 (1955).

6. N. Steenrod, Topology of Fibre Bundles, Princeton Univ. Press, 1951.


VI 


Relativistic Fluids 
in Cosmology¹


CHARLES W. MISNER 


The present state of the universe seems to correspond to a pressureless 
dust of galaxies or galactic clusters which is a rather trivial fluid. However, 
there are reasons to think that in the past much more interesting fluids 
(and nonfluids) were significant. 

The most crucial point of theory is the set of “singularity theorems”
of Penrose and Hawking. The crucial observations are the Hubble expan- 
sion and the recent observations of the 3°K microwave radiation which 
apparently fills the universe. 

Following up Penrose’s brilliant insight (1965, Phys. Rev. Letters 14, 
57), Hawking has formulated and proven several theorems (1966, Proc. Roy. 
Soc. A295, 490, also 1967, Proc. Roy. Soc. in press) to the effect that all 
solutions of the Einstein equations which have a gross resemblance to the 
currently popular cosmological models have a singularity some finite time 
in the past. For example, one such theorem states that the following
hypotheses are inconsistent: (a) $M$ is a $C^\infty$ pseudo-Riemannian 4-manifold
of Lorentz signature, (b) $M$ is timelike complete so all timelike geodesics can
be continued to infinite proper length, (c) the Ricci tensor satisfies $R_{\mu\nu} u^\mu u^\nu > 0$
for all timelike vectors $u^\mu$, (d) $M$ contains a compact spacelike hypersurface $\Sigma$,
(e) the normal $n^\mu$ to $\Sigma$ satisfies $n^\mu{}_{;\mu} > 0$ everywhere on $\Sigma$. The Einstein


¹ No text of the lecture actually presented at the Battelle Rencontres, “Mathematical
Problems in Chaotic Cosmology,” was supplied. The following is the text of a talk given
in Paris at the Collège de France on 19 June 1967 at the “Colloque international du Centre
National de la Recherche Scientifique, ‘Fluides et Champ Gravitationnel en Relativité
Générale,’” which covered some of the same ideas, and is reproduced here with the kind
permission of the C.N.R.S.


Supported in part by NASA Grant NSG436 and the U.K. Science Research 
Council. 






equations figure only in the weak hypothesis (c), while (d) and (e) provide 
the tie-in to some ideas of an expanding universe. The theorems of which 
this is a sample appear to me to be the most important development on the 
theoretical side of gravity physics since Einstein formulated general relativity. 
They are rigorous global theorems about solutions of the full nonlinear 
Einstein equations which state unexpected results of great physical interest. 
Although the theorem quoted above allows one to conclude that solutions of
the Einstein equations for fluids satisfying $\varepsilon + 3p > 0$ determined by Cauchy
data on an everywhere expanding closed initial hypersurface contain incomplete
timelike geodesics, most of the theorems are not so specific about
the nature of the singularity, and would allow for some failure of causality
as an alternative to incompleteness. But even incompleteness does not imply
infinite densities or infinite curvature, as one knows from examples (Misner
and Taub, J.E.T.P. 1968), so the nature of the singularity one must expect
remains relatively unknown and is an important theoretical problem of
current interest. Some rough categories for possible types of singularities are


(a) causality failure, 
(b) isolated or “almost hidden” singularity,
(c) total singularity. 
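
To see why a fluid condition such as $\varepsilon + 3p > 0$ feeds hypothesis (c) of the theorem quoted earlier, one can work out the Ricci contraction for a perfect fluid. The sketch below rests on assumptions not spelled out in the text: signature $(+,-,-,-)$, a perfect-fluid energy tensor with energy density $\varepsilon$, pressure $p$ and unit 4-velocity $u^\mu$, and Einstein equations written as $R_{\mu\nu} - \tfrac12 g_{\mu\nu} R = \chi T_{\mu\nu}$ with $\chi > 0$. Other sign conventions change intermediate signs but not the conclusion.

```latex
\begin{align*}
T_{\mu\nu} &= (\varepsilon + p)\,u_\mu u_\nu - p\,g_{\mu\nu},
  \qquad T = g^{\mu\nu}T_{\mu\nu} = \varepsilon - 3p, \\
R_{\mu\nu} &= \chi\bigl(T_{\mu\nu} - \tfrac12\, g_{\mu\nu}T\bigr), \\
R_{\mu\nu}u^\mu u^\nu &= \chi\bigl(\varepsilon - \tfrac12(\varepsilon - 3p)\bigr)
  = \tfrac12\,\chi\,(\varepsilon + 3p).
\end{align*}
% For a general unit timelike vector w^\mu one has (u_\mu w^\mu)^2 \ge 1, so
\[
R_{\mu\nu}w^\mu w^\nu
  = \chi\bigl[(\varepsilon + p)(u_\mu w^\mu)^2 - \tfrac12(\varepsilon - p)\bigr]
  \ge \tfrac12\,\chi\,(\varepsilon + 3p)
  \quad\text{provided } \varepsilon + p \ge 0 .
\]
```

Under these assumptions the inequality in hypothesis (c) holds for every timelike direction as soon as $\varepsilon + 3p > 0$ and $\varepsilon + p \ge 0$.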


The presence of closed timelike curves might characterize the first type, 
and I would guess that this sort of singularity is very unstable and is easily 
converted by small perturbations into one involving infinite curvature. 
An example of the second type of singularity, where almost all geodesics are 
complete, has been given by Carter (1967, to be published). Because 
nothing is known about the nature of the typical singularity, and because it 
leads to interesting consequences, I will pursue the assumption that the typical 
singularity is “total” in that all the matter in the universe is processed by an
initial singularity involving infinite density and infinite irregularity. This
provides a starting point for a theory of “Chaotic Cosmology.”

The main theoretical problems for Chaotic Cosmology are as follows: 


(i) To formulate statistical methods of describing the initial data near
the singularity.
(ii) To understand the nature of “horizons” in irregular expanding
universes.
(iii) To “prove the cosmological principle.”


The first point here is just another statement of the problem of under- 
standing the nature of the singularity which the Einstein equations require, 
but a statement prejudiced toward an answer in language and content related 
to theories of turbulence. The second point emphasizes that it may take 
considerable time after the singularity before two different pieces of matter 
have time to influence each other through causal signals, so the evolution of 




disturbances on scales larger than the current horizon will be very different 
from that on smaller scales. The third point is a challenge to see whether 
enough well-founded physical processes can be invoked by the theory to allow 
the interactions (which take place within the horizon at any time) to modify 
the initially singular chaos in a way which will result in that homogeneity 
and isotropy of space at the present time which has been the starting 
assumption for Robertson-Walker cosmological models. At very high temperatures 
neutrinos are very effective in producing homogeneity and isotropy (Misner 
1967, Nature 214, 40), but on a very small scale involving only a fraction of a 
solar mass’ worth of baryons. At lower temperatures when the horizons 
are larger, I expect shock waves to form from the large but smooth density 
and velocity gradients which would be noticed as the horizon expanded to 
allow one region to interact with a neighboring region which developed from 
different random initial conditions. Taub (Phys. Rev. 74 (1948) 328) has 
given an analytic description of the development of a large amplitude sound 
wave into a shock in special relativity, and it would be interesting to see the 
corresponding solution in general relativity as a modification of the k = 0 
radiation-filled Robertson-Walker universe. These shocks could be effective 
in dissipating irregularities as long as the universe remained radiation- 
dominated and while photon mean-free paths remained too short to make 
simple viscosity very effective for dissipating large scale irregularities. 

Nonfluids also play an important role in Chaotic Cosmology. Below 
about $10^{10}$ °K neutrinos are collisionless, so no mechanism exists to ensure 
that their momentum distribution, and hence their pressure, is isotropic. 
If one assumes homogeneous collisionless radiation then its properties are 
relatively simple and similar to those of an elastic solid. Using this model, 
and a viscous fluid at higher temperatures, I have been able to show that 
arbitrary initial anisotropy in a homogeneous expanding universe will be 
reduced to a low level, below the stringent observational 0.2% limit in the 
case of the models considered (Misner, Ap. J., Feb. 1968). This is an 
encouraging success for Chaotic Cosmology since I believe no other 
explanation for the isotropy of the blackbody radiation has been proposed. But it 
would be very important to be able to consider inhomogeneous collisionless 
radiation as well. This is a relatively simple, ideal medium, and since 
collisionless neutrinos probably account for from 45% to 65% of the mass 
density in the universe between $10^{10}$ °K and about 10° °K, considerable 
effort toward understanding its properties and behavior would be justified. 
Note also that because collisionless radiation can support transverse shear 
stresses, it interacts directly with gravitational radiation in a way which fluids 
cannot. It is possible, for instance, that gravitational waves propagating 
through collisionless radiation would travel at speeds less than light, just as 
do electromagnetic waves traveling through a plasma or other polarizable 
medium. 


120 CHARLES W. MISNER 


REFERENCES 


1. C. W. Misner, “The Isotropy of the Universe,” Astrophys. J., Feb. 1968. 

2. R. W. Michie, “On the Growth of Condensations in an Expanding 
Universe,” Astrophys. J., 1968 (to be published). 

3. S. W. Hawking, “The Occurrence of Singularities in Cosmology, III,” 
Proc. Roy. Soc., A300, No. 1461, Aug. 1967. 

4. C. W. Misner, Nature 214 (1967) 40-51. 

5. C. W. Misner, Phys. Rev. Lett. 19 (1967) 533-5. 

6. J. Silk, Nature 215 (1967) 1155-6. 

7. E. R. Harrison, Phys. Rev. Lett. 18 (1967) 1011-3. 

8. A. G. Doroshkevich, Ya. B. Zel’dovich, and I. D. Novikov, ZhETF Pis’ma 
(JETP Letters) 5 (1967) 119-121 (96-98). 


VII 


Structure of Space-Time* 


R. PENROSE 


1 Introduction 
2 The Nature of General Relativity 
3 The Abstract Index Notation 
4 Space-Times with Spinor Structure 
5 The Interpretation of a Spin-Vector 
6 Explicit Curvature Formulas 
7 Einstein’s Equations and Focusing 
8 Conformal Infinity 
9 Horizons 
10 Gravitational Collapse 
11 Singularities in Cosmology 
References 


1 INTRODUCTION 


According to present-day theory, all the phenomena of physics take 
place within the framework of a certain differentiable manifold referred to as 
the space-time continuum. Our familiarity with this idea is such that it is 
now regarded as almost “obvious” that space and time should constitute 
such a structure. However, before discussing the nature of this structure, 
it is worth examining something of what lies behind this belief. Indeed, 
there is the definite possibility that some future theory may be found which 
describes nature more accurately than present theory, but for which the 
differentiable manifold picture of space-time would not be appropriate. We 
should not close our minds to such a possibility, but also we should keep in 
mind the extraordinary range over which the present-day view is such an 
excellent approximation. 


* Section 10 (and also some small portions of other sections, notably 2 and 9) is based 
largely on the author’s 1966 Adams Prize essay An Analysis of the Structure of Space-Time. 




The very accurately “locally Euclidean” nature of space, and the 
continuity of time, would, indeed, seem to have supplied the prime motiva- 
tion, in the first instance, for the rigorous development of the continuum 
concept. At the time of Zeno, no such rigorous concept of continuum 
existed, so that the idea of a limit, in space or time, seemed puzzling. It does 
not seem puzzling to us today, but perhaps we are wrong not to be puzzled! 
The standard resolution of Zeno’s paradoxes refers more to the mathematical 
continuum concept than to the nature of space-time itself. The view of 
space-time as forming a continuum would imply that a continuous nature 
would persist, no matter how much a system is magnified. But it is not at all 
clear that continuous descriptions are really appropriate on a scale small 
enough that quantum phenomena become important. For example, at a 
scale of $10^{-13}$ cm (approximately the radius of an elementary particle), the 
mere attempt at localization of the position of a particle to that accuracy will, 
as a consequence of the uncertainty principle, imply the probable occurrence 
of a very large momentum, with the implication that new particles are created, 
some of which may be indistinguishable from the original particle. Thus 
the concept of “position” for the original particle becomes obscured [22]. 
More alarming, moreover, is the picture presented if we allow ourselves to 
discuss phenomena at a dimension of the order of $10^{-33}$ cm. At such a 
dimension, the quantum fluctuations in the curvature of space-time (if both 
present-day quantum theory and gravitation theory can be accurately ex- 
trapolated to this degree) would be large enough to produce alterations in 
topology. Thus, the view of space-time at this dimension would be some 
kind of chaotic linear superposition of different topologies [112]—a picture 
in no way resembling a smooth manifold. 
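
The order of magnitude quoted above for the scale at which curvature fluctuations become important can be checked numerically. The sketch below assumes (this identification is not made explicitly in the text) that the relevant scale is the combination $\sqrt{\hbar G / c^3}$ of the quantum, gravitational, and relativistic constants; the numerical values are modern SI figures, not values taken from the lecture.

```python
import math

# Modern SI values (an assumption of this sketch; the lecture quotes only
# orders of magnitude).
HBAR = 1.054571817e-34   # reduced Planck constant, J s
G = 6.67430e-11          # Newtonian gravitational constant, m^3 kg^-1 s^-2
C = 299792458.0          # velocity of light, m/s


def quantum_gravity_length_cm() -> float:
    """Return sqrt(hbar * G / c^3), converted from metres to centimetres."""
    return math.sqrt(HBAR * G / C**3) * 100.0


if __name__ == "__main__":
    print(f"sqrt(hbar G / c^3) ~ {quantum_gravity_length_cm():.2e} cm")
```

The result is a little over $10^{-33}$ cm, in agreement with the figure used in the text and some twenty orders of magnitude below the $10^{-13}$ cm scale of elementary particles.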

Whether or not it is meaningful to talk about the nature of space-time 
at such dimensions is not at all clear. But if it is not meaningful, then we 
certainly cannot refer to space-time as accurately constituting a smooth 
manifold. On the other hand, it may be argued that the smooth manifold 
picture is adequate for the discussion of all relevant physical processes. It is 
my personal view that this cannot ultimately be the case. I do not believe 
that a real understanding of the nature of elementary particles can ever be 
achieved without a simultaneous deeper understanding of the nature of 
space-time itself. But if we are concerned with a level of phenomena for 
which such an understanding is not necessary—and this will cover almost all 
of present-day physics—then the smooth manifold picture presents an 
(unreasonably!) excellent framework for the discussion of phenomena. 

Let us now set aside the question of the submicroscopic structure of 
space-time and concentrate, instead, on its large-scale properties. In this 
case, we may imagine that the smooth manifold picture will be adequate and 
that its structure in the large can be obtained by piecing together smaller 
“locally Euclidean” patches, in the manner of the overlapping coordinate 
neighborhoods of differential geometry. Thus we might arrive at a topology 
for space-time, in the large, different from a Euclidean topology. Unfortu- 
nately, too little is known about the large-scale structure of the universe to 
enable us to make any statement with confidence concerning its global 
topology (apart, perhaps, from certain statements about its orientability). 
Thus, it might be that the topology of space-time on a large scale is not at all 
interesting. 

Nevertheless, irrespective of this, it will be very worthwhile, here, to 
consider questions concerning topology of space-time manifolds in some 
generality. There are, in particular, two different but somewhat related 
special reasons for this. To appreciate, fully, the first of these, we must 
picture the universe in its four-dimensional entirety, rather than in terms of 
some three-dimensional spacelike section (the “present”). The one 
observational feature of the four-dimensional large-scale structure of the universe 
which has emerged with any semblance of certainty, is that at some “time” 
in the past (of the order of $10^{10}$ years ago) the material of the universe was 
apparently concentrated in a highly condensed and chaotic (hot) state. This 
follows from observed expansion of the universe and the equations of general 
relativity, if certain assumptions about the large-scale homogeneity of the 
universe are made. It also appears to follow more directly from the recent 
observation [25, 80a] of a general background of (electromagnetic) radiation 
permeating space. The present temperature of this radiation is about 3° 
absolute, which is what would be expected [33, 2] on the basis of a highly 
condensed general-relativistic “initial state.” (The cooling down of the 
radiation to its present value would be an effect of the expansion of the 
universe.) Now, if we take completely seriously the smoothed-out cos- 
mological models which are normally used in such calculations, we are 
actually presented with an initial singular state at which the curvature of 
space-time becomes infinite. In the neighborhood of the singularity, radii 
of curvature would become arbitrarily small: less than $10^{-13}$ cm; less, even, 
than $10^{-33}$ cm. Thus, we would expect the picture presented by our model 
to be inadequate at such levels, if only because of the reasons mentioned 
earlier for doubting the validity of the smooth manifold description of space- 
time at such dimensions. 

But can we really trust our model in regions where the curvature even 
remotely approaches these values? We may expect that the deviations from 
homogeneity which are now present in the curvature of space-time (owing to 
irregularities in the matter density) could, if extrapolated backward to the 
early highly curved space-time regions, result in very great deviations from 
the smoothed-out picture. Indeed, is there reason to believe in the existence 
of a singularity at all? (We may take as a tentative definition of 
“singularity,” a region at which curvatures have become so large, that the 
local physics becomes drastically altered, perhaps because of a breakdown in 
the smooth manifold picture of space-time.) Might it not be that when 
curvatures become only “moderately” large, the deviations from the model 
could be enormous, perhaps resulting even in a topological structure totally 
different from that of the model? It will be one of the main purposes of the 
later lectures in this series to give some precise results which strongly indicate 
(although, perhaps, they do not quite prove) the existence of singularities in 
space-time, in generic situations, on the basis of general relativity. In order 
to achieve these results it will be necessary to consider quite complicated 
possibilities of a somewhat topological nature—even if such complications 
are not actually realized in the universe! 

This is the first of the two compelling reasons, alluded to above, for 
studying the subject of space-time topology. The second of these (leaving 
aside the question of relevance to the submicroscopic nature of space-time) 
concerns the question of gravitational collapse. The inherent instability of 
gravitation, as implied by general relativity, when too large concentrations of 
masses are present, is reflected in the existence of an initial singularity of the 
cosmological models. This instability becomes manifest again in the 
collapsing phase of those models for which the expansion does not continue 
indefinitely and the universe returns to infinite curvature in a final singular 
state. But it is not necessary for the universe to be involved as a whole, for 
the effects of this instability to become manifest. In fact, even bodies which 
are not very much more massive than the sun may be expected to collapse 
catastrophically when they have exhausted their resources of internal energy. 
If the conditions are right, such a body will collapse past a “point of no 
return” at which (roughly speaking) the gravitational forces become so 
strong that even the light emitted by the body is dragged inward and no signal 
can escape to the outside world. Beyond this point the behavior of the body 
is much like the final phase of a collapsing universe (or, presumably, the 
time-reverse of the initial phase of an expanding one). Singularities in 
space-time would be expected to result, but (in the situation under considera- 
tion) these singularities would not be visible from the outside. Instead, 
there ensues something of the nature of a “hole” in space, into which objects 
can fall and out of which no object or signal can escape. Bearing in mind 
the mass range that might be likely to be involved, these “holes” could range 
in size from a few kilometers in diameter to a few times the dimension of the 
solar system. They would (apparently) be difficult—but not in principle 
impossible—to detect from outside. At the present time their probable 
existence is inferred theoretically rather than observationally, but in any case 
there are many intriguing questions of space-time topology involved in 
their study. 

The nature of space-time is not quite what it “seems” to be, and in 
these respects it is not well understood at present. Undoubtedly, future 
theory will present us with many surprises and important new insights. But 
even the present theory of space-time (and by this I mean Einstein’s 1916 
theory of general relativity) contains surprises and insights which are only 
just beginning to be explored in any detail. There is surely a great wealth 
of results which may still be fairly readily obtained by use of the appropriate 
mathematical tools. I hope that by presenting a few of the results which 
have been so far obtained and by indicating some tools which have recently 
proved useful, I may be able to encourage experts in other fields to take a 
(possibly active) interest in this subject. 


2 THE NATURE OF GENERAL RELATIVITY 


The most important single lesson of relativity theory is, perhaps, that 
space and time are not concepts that can be considered independently of one 
another but must be combined together to give a four-dimensional picture of 
phenomena: the description in terms of space-time. Dynamics then becomes 
an aspect of geometry. It is somewhat instructive to see how this idea may 
be applied even in prerelativistic dynamical theories. Thus, we may com- 
pare the dynamics of Aristotle, Galileo, or Newton with special or general 
relativity, by expressing all five types of theory in space-time terms. 

Let us consider five kinds of space-time, which I shall label as follows: 


ARISTOTELIAN SPACE-TIME (2.1) 
GALILEAN SPACE-TIME (2.2) 
NEWTONIAN SPACE-TIME (2.3) 
MINKOWSKIAN SPACE-TIME (2.4) 
EINSTEINIAN SPACE-TIME (2.5) 


In each case, the space-time will be a four-dimensional smooth manifold, but 
with some additional geometric structure which describes an important aspect 
of the dynamics. Any point of the space-time is really an “event,” that is, 
something one pictures as a point in space, but with only an instantaneous 
existence. The history of a particle is a curve¹ in space-time, called the world 
line of the particle. 

Aristotelian space-time is simply a product $E^3 \times E^1$, where $E^n$ denotes 
the n-dimensional Euclidean space carrying the usual Euclidean metric and 
possessing the usual ($\tfrac{1}{2}n(n + 1)$-dimensional) group of motions. The metric 
of $E^3$ defines spatial separation and that of $E^1$ defines time-difference. Thus, 
in accordance with Aristotelian dynamics, it is meaningful to speak of an 
absolute spatial separation between two events, even when they possess a 
nonzero time difference. In particular, the state of rest of a particle is to be 
distinguished from all other motions, by the fact that the spatial separation 
vanishes between any two points on the particle’s world line. The 
seven-parameter transitive group of motions of Aristotelian space-time is the direct 
product of the Euclidean group on $E^3$ with that on $E^1$. 

¹ Unless otherwise stated, a “curve” will not carry a parameter (cf. section 9). 
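
The parameter count for the Aristotelian group of motions can be verified by a short calculation. The sketch below simply counts the translations and rotations of $E^n$ and adds the two factors of the direct product; the function names are illustrative and not part of the text.

```python
def euclidean_group_dim(n: int) -> int:
    """Dimension of the group of motions of E^n.

    There are n independent translations and n(n - 1)/2 independent
    rotations, giving n(n + 1)/2 parameters in all.
    """
    translations = n
    rotations = n * (n - 1) // 2
    return translations + rotations


def aristotelian_group_dim() -> int:
    """Direct product of the groups of motions of E^3 and E^1."""
    return euclidean_group_dim(3) + euclidean_group_dim(1)


if __name__ == "__main__":
    print(euclidean_group_dim(3), euclidean_group_dim(1), aristotelian_group_dim())
```

The six parameters for $E^3$ and the single time translation for $E^1$ give the seven parameters stated above.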

Galilean and Newtonian space-times differ from Aristotelian space-time 
in that the spatial separation of two points is only well-defined if their 
time difference vanishes. On the other hand, the time difference is always 
well-defined. Thus, the structure is more accurately that of a fiber bundle 
over $E^1$ with fiber $E^3$, so that the “time” $E^1$ may be thought of as the factor 
space of the whole space by the fiber $E^3$. (The topology is, of course, still 
the same as in the Aristotelian case, but the bundle structure is different.²) 
The distinction between the Galilean and Newtonian space-times arises only 
after some further structure has been imposed. For this, we single out a 
special (six-parameter) family of curves in the space-time which we refer to as 
the “geodesics.” (No extremal property of these curves is intended to be 
suggested here. We merely require that a unique such geodesic pass through 
each point of the space-time in each direction.) These geodesics (except 
for those lying in the $E^3$ fibers) will then be the world lines of particles in 
inertial motion. To obtain the geodesics for Galilean space-time, we merely 
postulate that they agree with the system of straight lines in some $E^4$, where 
the $E^3$ fibers are identified as a maximal system of parallel planes in $E^4$ and 
where the factor space $E^1$ is defined in the obvious way. The group of 
motions of the Galilean space-time which preserves this structure is the 
ten-parameter Galilei group. 

The definition of the geodesics for Newtonian space-time is more 
complicated. The idea here [20, 108, 109] is to treat Newtonian gravitational 
theory from the point of view presented by Einstein’s general theory. Thus, 
we do not regard gravitation as a true force, but as an inertial or “fictional” 
force along with the acceleration, centrifugal, and Coriolis forces. This is 
possible³ because of the Galilei-Einstein equivalence principle, which states 
the equality between inertial and passive gravitational mass. Then a 
particle moving under the action of gravity, but not influenced by any other 
force, is considered to be moving inertially and its world line is to be a 
geodesic. Many different Newtonian space-times are possible, corresponding 
to the various inequivalent gravitational fields. Galilean space-time is a 
special case, corresponding to zero gravitational field or, equivalently, for 
example, to a field of constant gravitational acceleration defined throughout 
space-time. For, as in general relativity proper, the physically meaningful 
aspect of a gravitational field is contained only in the “tidal force” which 
results from a nonuniform gravitational acceleration field. These tidal forces 
find expression in a “curvature” for Newtonian space-time which describes 
the extent to which the intrinsic geodesic structure differs from that of the 
Galilean space-time (Fig. 1). The “field equations” of Newtonian space-time 
relate this curvature to the distribution of gravitating matter. These 
may be obtained by applying the limit $c \to \infty$ to the equations of general 
relativity (c being the velocity of light) but the details of this will not be 
entered into here.⁴ 


FIGURE 1. The curvature of Newtonian space-time is directly measured by 
geodesic deviation, that is, by tidal forces. 


² I am grateful to A. Trautman for clarifying this point for me. 

³ In fact, the principle of equivalence really also tells us that it is necessary to take 
this view of Newtonian gravitation. A constant gravitational field (in the Newtonian 
sense), throughout the whole of space, would be totally unobservable, all bodies being 
accelerated equally. The intrinsic physics would be the same as if the field did not exist. 
The situation is somewhat similar to that of electric potential. If we add (for example) a 
constant potential over the whole universe, the intrinsic physics is totally unaltered. Thus 
we think of Newtonian gravitational “field” as being really a type of potential. We have to 
differentiate to get the true field, namely, the curvature. We might imagine fixing the 
potentials canonically by requiring that they go to zero at infinity, but it is difficult to see 
how one would apply this in the actual universe, there being no evidence that the density of 
gravitating bodies falls off toward infinity. 

⁴ For the details of this, see Trautman [108, 109]. Trautman also considers the more 
general case where the Newtonian spaces of constant time need not be Euclidean $E^3$’s. 
This enables “Newtonian cosmology” to be considered. 

Minkowskian and Einsteinian space-times differ radically from the 
three previous cases in that no additive concept of time difference is defined 
between events. Instead, a pseudo-Riemannian⁵ metric form $ds^2$, of 
hyperbolic normal signature (that is, $+, -, -, -$) is defined on the 
space-time. The time difference between two points A, B in the space-time depends 
on the choice of world line connecting the points and is given by the integral 
of ds along the world line: 


$$\tau = \int ds \tag{2.6}$$ 


For an allowable (that is, “timelike” or “null”) world line, we have $ds^2 \geq 0$ 
everywhere along the curve, whence $\tau$ is, in fact, a real number. The quantity 
$\tau$ defines the time interval (proper time) between A and B as measured by an 
(idealized) clock whose world line is the given curve. Since proper time is 
now a path-dependent concept, we can return to the definition of a 
“geodesic” as an extremal path. The system of (timelike) geodesics thus 
obtained, defines (according to the theory) the inertial motions of particles. 
Unlike in the Galilean and Newtonian cases just considered, the inertial 
motions are now fixed once the behavior of (idealized) clocks has been 
specified. 
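
The path dependence of proper time is easily illustrated in flat (Minkowskian) space-time. The following is a minimal sketch, assuming one spatial dimension, units with c = 1, and world lines built from straight segments; the function name and the sample events are illustrative only.

```python
import math


def proper_time(events):
    """Integrate ds along a piecewise-straight world line.

    `events` is a list of (t, x) pairs with c = 1. On each straight
    segment ds^2 = dt^2 - dx^2, which must be non-negative for an
    allowable (timelike or null) world line.
    """
    total = 0.0
    for (t0, x0), (t1, x1) in zip(events, events[1:]):
        ds2 = (t1 - t0) ** 2 - (x1 - x0) ** 2
        if ds2 < 0:
            raise ValueError("segment is spacelike: not an allowable world line")
        total += math.sqrt(ds2)
    return total


A = (0.0, 0.0)
B = (10.0, 0.0)

straight = proper_time([A, B])             # clock at rest between A and B
detour = proper_time([A, (5.0, 3.0), B])   # out at speed 0.6, then back

if __name__ == "__main__":
    print(straight, detour)   # 10.0 8.0
```

The two clocks connect the same pair of events but register different intervals. The straight world line, which is the geodesic, gives the larger value, so that for timelike curves in this flat example the extremal path is one of maximum proper time.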

The relation between Minkowskian and Einsteinian space-times 
mirrors that between Galilean and Newtonian space-times. Thus, 
Minkowskian space-time has a unique description (for example, its geodesic 
structure is the same as that of $E^4$, the set of timelike geodesics corresponding 
to geodesics in $E^4$ making an angle of less than 45° with some fixed direction) 
and it does not describe gravitation. It has a ten-parameter transitive group 
of motions, namely, the Poincaré (that is, inhomogeneous Lorentz) group. 
On the other hand, many Einsteinian space-times exist, corresponding to the 
various inequivalent gravitational fields. As with the Newtonian space- 
time, it is the tidal gravitational force which has physical significance, this 
being present when there is a gradient of the “gravitational acceleration 
field.” This tidal force is described in terms of deviations of the intrinsic 
geodesic structure from that of Minkowskian space-time, that is to say, by 
the curvature of the Einsteinian space-time [cf. (7.8)]. Einstein’s field 
equations then describe how this space-time curvature is to be related to the 
density of matter (that is, of stress-energy-momentum). 

⁵ The word “Riemannian” is generally reserved for the case of a positive-definite 
metric; thence the term “pseudo-Riemannian” for indefinite metrics. Sometimes the 
word “Lorentzian” is used for a pseudo-Riemannian manifold of signature $(+, -, \ldots, -)$. 

Before adopting the Einsteinian point of view, it is reasonable to 
question whether space-time actually possesses a uniquely and accurately 
defined pseudo-Riemannian structure. This depends on the existence of 
accurate clocks in nature, on the fact that such clocks behave locally 
according to the laws of special relativity and on the fact that for two coincident 
clocks, the time rate they register is in a well-defined ratio which does not 
depend on their location in space-time nor on their histories. It seems that 
the existence of accurate clocks is deeply related to the quantum nature of 
matter. Ultimately this comes down to the fact that a natural frequency $\nu$ 
is associated (via Planck’s constant $h$) with any mass $m$. For, combining 
Einstein’s mass-energy law $E = mc^2$ with Planck’s formula $E = h\nu$ (and 
choosing units, as we shall, henceforth, so that the velocity of light $c = 1$) 
we have 


$$\nu = m h^{-1} \tag{2.7}$$ 


Any fundamental particle therefore defines a scale of time via its own rest 
mass m. The time-scale may be thought of as a series of “ticks” along the 
world line of the particle,⁶ these being (by definition) a separation $\nu^{-1}$ apart. 
This gives us a definition of interval ds along the particle’s world line and by 
allowing this world line to vary, we have ds for any timelike interval. Because 
of the very great local accuracy of special relativity, this then serves to define 
a pseudo-Riemannian structure for space-time (see Fig. 2). Finally, it 
appears that the pseudo-Riemannian structure we obtain is independent of 
the choice of particle that we use to define it. This is because the masses of 
elementary particles seem to be in a definite fixed ratio to one another, which 
is independent of their location in space-time or their histories. (If this were 
found not to be the case, then we should only have a uniquely defined 
conformal geometry, that is, a metric up to a local factor.) 


FIGURE 2. The ds defined by a particle moving in different timelike directions at 
a point P is consistent with a pseudo-Riemannian structure. This is 
because of the local accuracy of special relativity. 


⁶ In practice, the Planck frequencies of elementary particles are extremely high and 
cannot be used directly as clocks. The corresponding Planck frequencies for compound 
systems are even higher! Essentially the frequencies used in atomic or nuclear clocks are 
obtained from differences of masses (that is, “beats”). 
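
To give an idea of the magnitudes involved in formula (2.7), the natural frequency of a particle may be evaluated numerically. The sketch below restores the factor $c^2$ (the text sets c = 1), so that $\nu = mc^2/h$; the constants are modern SI values and the choice of electron and proton as examples is an assumption of this illustration, not something taken from the text.

```python
# Modern SI values (assumed for this sketch).
H = 6.62607015e-34             # Planck constant, J s
C = 299792458.0                # velocity of light, m/s
M_ELECTRON = 9.1093837015e-31  # electron rest mass, kg
M_PROTON = 1.67262192369e-27   # proton rest mass, kg


def natural_frequency_hz(mass_kg: float) -> float:
    """Frequency nu = m c^2 / h associated with a rest mass m."""
    return mass_kg * C**2 / H


def tick_separation_s(mass_kg: float) -> float:
    """Proper-time separation 1/nu between successive 'ticks'."""
    return 1.0 / natural_frequency_hz(mass_kg)


if __name__ == "__main__":
    for name, m in (("electron", M_ELECTRON), ("proton", M_PROTON)):
        print(f"{name}: nu = {natural_frequency_hz(m):.3e} Hz, "
              f"tick = {tick_separation_s(m):.3e} s")
```

The electron frequency comes out at about $10^{20}$ Hz and that of the proton at about $10^{23}$ Hz, which makes plain why such frequencies cannot be used directly as clocks. Their ratio is just the ratio of the masses, which is the fixed ratio on which the uniqueness of the metric depends.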

So far, I have made no mention of spacelike separations, even though 
one normally thinks of Riemannian geometry in terms of distances rather 
than times. This is because a distance measurement is really a more 
complicated process than a time measurement. Given any two nearby points 
P, Q with a spacelike separation, we cannot just put a “ruler” between them 
to measure their separation. A ruler is not really an appropriate object for 
measuring distances between events. In space-time, a ruler appears as a 
two-dimensional timelike strip. Unless we arrange that the points P, Q on 
the two edges of the strip are “simultaneous in the ruler’s rest frame,” we 
shall get the wrong answer for the interval PQ (Fig. 3). In fact, the rigidity 
of a ruler is also a complicated matter and depends ultimately on the 
intersections between the “clocks” defined by its constituent atoms. To measure 
spatial intervals, it is actually more direct to use a clock⁷ and reflected light 
signals, such as in the manner indicated in Fig. 4. The spacelike interval PQ 
is then equal to the timelike interval between emission and reception of the 
signals (assuming the signals are emitted in opposite directions relative to the 
clock, that is, that the whole figure lies in a timelike plane). 


FIGURE 3. A ruler wrongly measuring a spacelike interval PQ. The events 
P, Q are not simultaneous in the rest-frame of the ruler. 




FIGURE 4. A clock correctly measuring the spacelike interval PQ by use of 
reflected light signals. (The whole figure must lie in a plane.) 
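

The equality of the spacelike interval PQ with the clock's timelike interval can be checked in flat space-time. The sketch below works in a single timelike plane with coordinates (t, x) and c = 1. It assumes the configuration in which light signals leave the clock at one event A, are reflected at P and at Q, and return to the clock at one event B; the function names and sample events are illustrative and not taken from the text.

```python
import math


def interval_squared(e1, e2):
    """ds^2 = dt^2 - dx^2 between two events (t, x), with c = 1."""
    return (e2[0] - e1[0]) ** 2 - (e2[1] - e1[1]) ** 2


def emission_and_reception(P, Q):
    """Events A (emission) and B (reception) on the clock's world line.

    A is the event from which oppositely directed light signals reach
    P and Q; B is the event at which the reflected signals meet again.
    P and Q must have a spacelike separation.
    """
    if interval_squared(P, Q) >= 0:
        raise ValueError("P and Q must be spacelike separated")
    if P[1] > Q[1]:
        P, Q = Q, P                    # arrange that P lies to the left of Q
    (tP, xP), (tQ, xQ) = P, Q
    # A -> P travels leftward (t + x constant); A -> Q rightward (t - x constant).
    A = (((tP + xP) + (tQ - xQ)) / 2, ((tP + xP) - (tQ - xQ)) / 2)
    # P -> B travels rightward (t - x constant); Q -> B leftward (t + x constant).
    B = (((tP - xP) + (tQ + xQ)) / 2, ((tQ + xQ) - (tP - xP)) / 2)
    return A, B


P = (0.3, -1.0)
Q = (-0.2, 2.0)
A, B = emission_and_reception(P, Q)

spacelike_length = math.sqrt(-interval_squared(P, Q))
clock_time = math.sqrt(interval_squared(A, B))

if __name__ == "__main__":
    print(spacelike_length, clock_time)
```

Note that P and Q need not be simultaneous in the coordinates chosen, nor need the clock be at rest in them. The four events form a parallelogram of light rays, and the two diagonals AB and PQ have intervals of equal magnitude and opposite sign.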


Let us, then, agree that space-time, via accurate clocks, possesses a 
uniquely defined natural pseudo-Riemannian metric (of signature $+, -, -, -$). 
It is still reasonable to question whether the geodesics defined by 
this metric are likely to have anything to do with the inertial motions of 
particles. When general relativity was first put forward, the identity of the 
inertial motions with the timelike geodesics was assumed as a postulate of 
the theory. However, Einstein and Grommer [30] were later able to show,⁸ 
that this property is actually a consequence of Einstein’s field equations 
[cf. (7.1)] if certain reasonable assumptions are made about how a (test) 
particle is to be represented as a limitingly small distribution of energy- 
momentum. The argument is a delicate one, but it depends ultimately on 
the existence of a covariant conservation law for the energy tensor [cf. (8.1)]. 
Beyond this, it does not depend critically on the exact form of Einstein’s field 
equations. 

⁷ Synge [103] has particularly emphasized that the study of space-time geometry is 
essentially “chronometry.” 
⁸ For an important later paper see [31]; for a more recent survey, see [36]. 

It is also possible to run the argument the other way, that is, to regard 
inertial motions as primary and to try and construct the metric as a secondary 
concept. According to an argument of Weyl [111] and of Marzke and 
Wheeler [56], if both the timelike geodesics (inertial particle world lines) and 
null geodesics (unscattered light rays) are known, then the metric of 
space-time can be constructed uniquely up to an overall factor. This, in a sense, 
has some advantage over the definition in terms of clocks, since the primary 
concepts do not depend on parts of physics (that is, quantum theory) which 
may be felt to be foreign to general relativity. My personal preference for 
regarding the clocks as more basic arises partly from the fact that one needs 
a number of apparently ad hoc assumptions on the geodesic structure of 
space-time in order that it be consistent with any metric at all, so it ceases to 
be clear that the structure of space-time is actually a pseudo-Riemannian one. 
In addition, it is my personal view that there is a deep connection between 
quantum theory and general relativity, so that it may actually be a mistake 
to attempt to build the subjects up separately. 

For general relativity to be a subject of relevance to nature, it is necessary 
that the pseudo-Riemannian structure of space-time be not only accurately 
defined, but also nontrivial (that is, with somewhere nonvanishing curvature). 
The fact that the space-time metric is curved⁹ is actually a consequence of 
very primitive considerations [96] concerning the nonexistence of a perpetuum 
mobile. Consider a closed chain of buckets [9, p. 418] stretched between 
two pulleys, the axes of the two pulleys being fixed relative to the earth at 
different gravitational potentials (Fig. 5). Each bucket contains an atom 
which is capable of two states, a ground state and an excited state. The 
atoms on the left-hand side are all excited while those on the right are in the 
ground state. Now, the excited atoms, possessing more energy than the 
ground-state atoms, are also more massive. Therefore (by a very weak 
version of the principle of equivalence), they must also weigh more. We 
assume that there is no friction in the system, whence the chain begins to 
rotate in a counterclockwise direction. When each bucket reaches the 
bottom it is induced to give up its excess energy in the form of a photon which, 
by means of mirrors, is reflected back to the top atom. Now if the photon 
could be reabsorbed to excite the top atom we would have a perpetuum 
mobile since the process could be repeated indefinitely and energy could, in 


9 The following argument will not show that space-time is conformally curved (that is,
with a metric not expressible in the form $ds^2 = f(x, y, z, t)\{dt^2 - dx^2 - dy^2 - dz^2\}$).
For this, the most direct argument is the bending of light by the sun. Space-time curvature
seems to be about the only way of "bending" light without the different frequencies
being bent differently (chromatic dispersion). In a conformally flat theory (for example,
Nordström's [66] gravitational theory), there would be no resultant bending of light by
gravitation (essentially because null geodesics are conformally invariant). It may be taken
that the existence of some achromatic bending is experimentally established.

The most obvious manifestation of space-time curvature is the existence of planetary 
orbits. I do not choose to use this as the primary reason for believing in the curvature of 
space-time, because the connection of curved orbits with the behavior of clocks is not very 
direct, depending on the derivation of equations of motion from field equations on the 
metric (see above). 


Structure of Space-Time 133 


FIGURE 5. Bondi's endless chain of buckets. The weight of the extra energy of
the excited atoms would cause the chain to rotate. [Diagram not reproduced; labeled element: mirror.]


principle, be extracted from the continuing rotation of the chain. Thus, we 
must assume that the photon is not capable of being absorbed by the top 
atom. Presumably, if Newton had been presented with this situation, he
would have had a simple answer: The photon is "weakened" by the time it
reaches the top and possesses insufficient energy to excite the top atom. But
because of quantum theory, we know that a photon cannot be just "weakened."
By Planck's formula, it must also be of a lower frequency when it reaches the
top than that required to raise the top atom to its excited state. Thus, in a
well-defined sense, clocks must run more slowly at the bottom of the chain
than at the top. Thus, the metric of space-time, as we have defined it, must
differ, in a gravitational field, from the ordinary flat space-time form. (This
is a highly idealized piece of apparatus, but it is of interest to note that this
clock-slowing effect has actually been directly observed on the earth, by
Pound and Rebka [85] by use of the Mössbauer effect.) Note that the
principle of equivalence is only invoked in an exceedingly weak form, namely, 




that energy just has some weight. (An extreme version of the experiment 
would be to employ π⁰ mesons rather than atoms. These can decay entirely
into photons, leaving the right-hand buckets empty. Thus, the argument
will apply if π⁰ mesons weigh anything at all!)
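
The energy bookkeeping behind the bucket argument can be written out explicitly. The following is only a sketch under assumptions that the text does not state: a uniform field of strength $g$, a height difference $h$ between the pulleys, and working to first order in $gh/c^2$. The symbols $g$, $h$, $h_{\mathrm P}$ (Planck's constant), $\Delta E$, $\Delta m$ and $W$ are introduced here for illustration.

```latex
% Extra energy of an excited atom, and the extra mass that goes with it:
\Delta E = h_{\mathrm P}\,\nu_{\text{bottom}}, \qquad \Delta m = \Delta E / c^{2}.
% Net work gained per bucket: an excited atom descends through h
% while a ground-state atom ascends through h.
W = \Delta m\, g\, h = \Delta E\,\frac{g h}{c^{2}}.
% No perpetuum mobile: the photon must arrive at the top short of \Delta E by W,
h_{\mathrm P}\,\nu_{\text{top}} = \Delta E - W
\quad\Longrightarrow\quad
\frac{\nu_{\text{top}}}{\nu_{\text{bottom}}} \simeq 1 - \frac{g h}{c^{2}} .
```

For a laboratory-scale height of order 20 m on the earth, $gh/c^2$ is of order $2 \times 10^{-15}$, which gives a sense of how small the clock-rate difference is.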

This argument is not, so far, quite sufficient to show that space-time is 
curved, however. The metric might still be flat but represented in an unusual 
way. From the spherical symmetry of the earth’s field this is readily seen to 
be, in fact, impossible, since it would imply that the metric of the earth’s field 
is related to a standard Minkowski space-time by transformations involving 
an acceleration outward from the center of the earth in all directions. The 
matter will not be pursued in further detail here, except that I feel it will be 
instructive just to examine the “uniform acceleration” transformation of 
Minkowski space-time explicitly. This illustrates quite clearly how a static 
situation can appear to involve a gravitational field, while the space-time is 
actually flat. This example also shares some features of the relation between 
the Schwarzschild and Kruskal representations of a true spherically sym- 
metric gravitational field, as we shall see later ((10.1), (10.4); see also 
Bergmann [6a]).

Let x, y, z, t be standard Minkowskian coordinates, so that the metric 
takes the form 


$$ds^2 = dt^2 - dx^2 - dy^2 - dz^2 \tag{2.8}$$


Introduce new coordinates X, Y, Z, T related to the original ones in the range
z > |t|, Z > 0, by

$$x = X, \quad y = Y, \quad z = Z \cosh T, \quad t = Z \sinh T \tag{2.9}$$

(see Fig. 6). The metric now becomes

$$ds^2 = Z^2\,dT^2 - dX^2 - dY^2 - dZ^2 \tag{2.10}$$


The form (2.10) has the appearance of a static gravitational field with Z 
playing the role of a gravitational potential. The term $(Z\,dT)^2$ in (2.10)
in place of the $dt^2$ seems to indicate that "time is running more slowly" near
Z = 0. Any particle held "fixed" in the X, Y, Z, T system (that is, with
X, Y, Z constant) would experience a constant force in the direction of
decreasing Z (since the particle's world line is "really" a constant-acceleration
"hyperbola" in the x, y, z, t system). A freely falling particle would
appear, in the X, Y, Z, T system, to fall toward Z = 0 and then slow down
and approach Z = 0 asymptotically (see broken curve in Fig. 6). There is an
apparent "barrier" at Z = 0 which particles do not seem to be able to cross.
That this is merely an effect of the inadequacy of the (X, Y, Z, T) coordinate
patch becomes evident if we transform back to the form (2.8). It will be 
useful to bear this feature in mind when we come to consider the discussion 
of gravitational collapse given in Section 10. 
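
The passage from (2.9) to (2.10), and the two statements about particle motion, can be checked in a few lines. The derivation below is a sketch added for illustration; the proper time $\tau$, the acceleration $a$ and the constant $z_0$ are symbols introduced here, with units chosen so that the speed of light is 1.

```latex
% Differentiate (2.9):
dz = \cosh T\, dZ + Z \sinh T\, dT, \qquad dt = \sinh T\, dZ + Z \cosh T\, dT .
% The cross terms cancel and \cosh^2 T - \sinh^2 T = 1, so
dt^{2} - dz^{2} = Z^{2}\, dT^{2} - dZ^{2},
% which, together with dx = dX and dy = dY, turns (2.8) into (2.10).
%
% A particle held at constant Z lies on the hyperbola z^2 - t^2 = Z^2,
% with proper time \tau = Z T and proper acceleration
a = 1/Z .
% A particle at rest in the original frame, z = z_0, has
Z = \frac{z_0}{\cosh T} \;\longrightarrow\; 0 \qquad (T \to \infty),
% so it approaches Z = 0 only asymptotically in T,
% although it reaches t = z_0 after the finite proper time z_0.
```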




FIGURE 6. The uniform acceleration transformation. The apparent barrier
at Z = 0 has no physical reality.


3 THE ABSTRACT INDEX NOTATION 


When performing calculations in general relativity it is frequently
necessary to operate with tensors of quite high valence.¹⁰ Even such a basic
quantity as the curvature tensor has valence four, and it possesses the familiar
somewhat complicated symmetries. This makes it practically imperative
that an index notation be employed for many calculations, so that the
different connections between the quantities involved may be easily kept
track of. It seems that there is a common feeling among mathematicians
that such notations are to be avoided, presumably because of the connotation
that their use entails explicit reference to a particular basis frame. However,
when a physicist refers to "$g_{ab}$" or "$R^a{}_{bcd}$," I do not think that he usually
means to be referring to a set of frame-dependent components but rather to a
physical, frame-independent object which these components represent. But
the index notation allows a very convenient set of algebraic operations to be




10 The term "valence" is used here in preference to "rank" because it is more
descriptive and because the word "rank" has other connotations in the case of matrices.




applied to such objects, which produce new objects—these operations being 
actually completely frame-independent. The algebraic operations are, in 
essence, extremely simple, but they also allow great flexibility in the building 
up of more complicated operations out of simple ones. It would seem a 
great pity to forbid oneself the use of such a powerful and flexible notation 
merely because of some uneasy feelings about summation conventions and 
dependence on special basis frames. What I shall present here is an entirely 
frame-independent algebra which allows one to calculate with indexed 
quantities exactly as before (but now with a clear conscience!) and which, 
by use of a notational device, even permits a greater freedom than before, 
when it comes to introducing coordinate systems and basis frames (compare 
Schouten [96a]). The advantages will be particularly apparent when we 
come to consider spinors in the next section. 

Let us not be completely formal, so we may be able to save some time
and complication. I hope the essential ideas will be clear. Consider a
vector space $V^\bullet$ over a field F or, more generally, we allow $V^\bullet$ to be a
module,¹¹ where F is a ring of suitable type (for example, the elements of $V^\bullet$
could be vector fields¹² and those of F, $C^\infty$ functions, on a manifold). The
idea will be to construct what is essentially the usual tensor product of $V^\bullet$
a number of times with its dual¹³ $V_\bullet$ a number of times, but where, by use of
indices, we can keep track of the effect of symmetries and contractions easily.
This is done by simply mirroring the usual index notation (with summation
conventions, etc.) but where now the indices a, b, c, ... are not to be thought
of merely as generic symbols standing, say, for 0, 1, 2, ..., N, but as abstract
labels. We shall require an infinite supply


$$a, b, c, \ldots, z, a_0, b_0, \ldots, z_0, a_1, \ldots, a_2, \ldots \tag{3.1}$$


of abstract labels, so that expressions of arbitrary length can be built up. Let
L denote the set of labels (3.1). For any element $\xi$ of $V^\bullet$ and any label
$x \in L$, we shall allow ourselves to write a symbol $\xi^x$. As $\xi$ ranges over the
elements of $V^\bullet$, the associated object $\xi^x$ ranges over a corresponding set $V^x$.
It should be emphasized here that $\xi^x$ is an entity in its own right and not the
set of components of $\xi$ in some frame. Now, since we wish to mirror the
usual tensor rules for indexed quantities, we are not permitted to write
$\xi^a + \eta^b$, but $\xi^a + \eta^a$ and $\xi^b + \eta^b$ will both be allowable. (We must think of
$\xi^a$ and $\xi^b$ as different objects.) For any $\lambda \in F$, we shall also be permitted to
write $\lambda\xi^x$. Thus, each of $V^a, V^b, \ldots, V^{a_0}, \ldots$ is a vector space or module
canonically isomorphic with $V^\bullet$.


11 A module differs from a vector space in that the scalars form a ring with identity 
rather than a field. A ring differs from a field in that division by nonzero elements is not 
always possible. 

12 Here a "field" means a cross-section of the appropriate vector bundle over the manifold.

13 The space of all linear mappings of the module $V^\bullet$ into the ring F.




It may be felt that it is unnatural to introduce an infinite number of
isomorphic spaces, when actually we only have one space. But we may view
the situation in a slightly different way. Each element of L is really just a
kind of organizational marker which keeps a tag on a particular vector
(etc.) irrespective of where it may occur in an expression. Thus, $\xi^x$ is just a
pair, consisting of $\xi$ together with the marker x. That is to say, it is an
element of $V^\bullet \times L$. We then have $V^a = V^\bullet \times (a)$, $V^b = V^\bullet \times (b)$, etc.
The vector space or module axioms will, of course, apply to each $V^x$:


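$$\begin{aligned}
\xi^x + (\eta^x + \zeta^x) &= (\xi^x + \eta^x) + \zeta^x \\
\lambda(\xi^x + \eta^x) &= \lambda\xi^x + \lambda\eta^x \\
(\lambda + \mu)\xi^x &= \lambda\xi^x + \mu\xi^x \\
\lambda(\mu\xi^x) &= (\lambda\mu)\xi^x \\
1\xi^x &= \xi^x \\
0\xi^x &= 0\eta^x
\end{aligned} \tag{3.2}$$

Here $\lambda, \mu, 1, 0 \in F$ with 1 and 0 being the multiplicative and additive identities,
respectively. We also have $\xi^x + \eta^x = \eta^x + \xi^x$ (expanding $(1 + 1)(\xi^x + \eta^x)$)
and $\xi^x + (-\xi^x) = 0$ (writing $-\xi^x$ for $(-1)\xi^x$ and 0 for $0\eta^x$).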
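
The idea that $\xi^x$ is simply the pair ($\xi$, x) can be mimicked in a few lines of code. The sketch below is an illustration only: the class name `Labelled`, its methods, and the use of a tuple of fractions as a stand-in for an abstract element of the base space are all assumptions made here, not part of the text. In particular the tuple is not meant as "the components in a frame"; any object supporting addition and scalar multiplication would serve.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Labelled:
    """An element of V^x modelled as a pair (vector, label)."""
    vector: tuple   # hypothetical stand-in for an element of the base space
    label: str      # the abstract label x: an organizational marker only

    def __add__(self, other):
        # xi^a + eta^a is allowed; xi^a + eta^b is not.
        if self.label != other.label:
            raise TypeError(f"cannot add V^{self.label} and V^{other.label}")
        summed = tuple(p + q for p, q in zip(self.vector, other.vector))
        return Labelled(summed, self.label)

    def __rmul__(self, scalar):
        # lambda * xi^x stays in the same copy V^x.
        return Labelled(tuple(scalar * p for p in self.vector), self.label)

    def relabel(self, new_label):
        # Canonical isomorphism V^x -> V^y: same vector, different marker.
        return Labelled(self.vector, new_label)


xi_a = Labelled((Fraction(1), Fraction(2)), "a")
eta_a = Labelled((Fraction(3), Fraction(-1)), "a")
eta_b = eta_a.relabel("b")

total = xi_a + eta_a          # allowed: both live in V^a
scaled = Fraction(2) * xi_a   # scalar multiple, still in V^a

try:
    xi_a + eta_b              # different labels: rejected
    mixed_sum_rejected = False
except TypeError:
    mixed_sum_rejected = True
```

The design choice worth noting is that the label takes no part in the arithmetic; it only decides which sums are meaningful, which is the role the text assigns to it.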

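The dual space $V_\bullet$ will also have an infinite number of canonically
isomorphic copies: $V_a, V_b, \ldots, V_{a_0}, \ldots$. We may think of $V_x$ as being the
dual space of $V^x$ for each $x \in L$. The elements of $V_x$ are linear mappings of
$V^x$ into F. Thus, for $\theta_x \in V_x$ we have

$$\theta_x(\xi^x + \eta^x) = \theta_x\xi^x + \theta_x\eta^x \tag{3.3}$$
$$\theta_x(\lambda\xi^x) = \lambda(\theta_x\xi^x) \tag{3.4}$$

where the effect of the mapping $\theta_x$ on $\xi^x$ is written simply $\theta_x\xi^x$. We shall
also allow this to be written in the reverse order: $\theta_x\xi^x = \xi^x\theta_x$. We require

$$\theta_a\xi^a = \theta_b\xi^b = \cdots = \theta_{a_0}\xi^{a_0} = \cdots \tag{3.5}$$

Now each of $V_a, V_b, \ldots$ will be a vector space or module, where $\lambda\theta_x$ and
$\theta_x + \phi_x$ are defined by

$$(\lambda\theta_x)\xi^x = \lambda(\theta_x\xi^x), \qquad (\theta_x + \phi_x)\xi^x = \theta_x\xi^x + \phi_x\xi^x \tag{3.6}$$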


The idea will be to use the elements of F, $V^a, V^b, \ldots, V_a, V_b, \ldots$ to
generate our algebra. To see what the rules of this algebra should be, we
must recall what the rules are for the ordinary tensor index notation. We
note, for instance, that products such as $\xi^a\eta^b$ will be permitted, whereas
$\xi^a\eta^a$ will not. Furthermore, the allowable products must be commutative:

$$\xi^a\eta^b = \eta^b\xi^a \tag{3.7}$$




but in general $\xi^a\eta^b \neq \eta^a\xi^b$. The requirement (3.7) shows us that while in
essence $\xi^a\eta^b$ is just the tensor product of elements $\xi^a \otimes \eta^b$, we cannot simply
identify $\xi^a\eta^b$ with $\xi^a \otimes \eta^b$. For tensor products, according to the strict
technical definition, are not commutative. Here we are allowed to define a
commutative version of a tensor product, essentially for the reason that
$\xi^a\eta^a$ is not defined. In a product $\xi^a\eta^b$, it is the labels a and b that tell us which
factor is which, not the ordering of the factors. One method (suggested to
me by S. Mac Lane) of precisely defining the type of product used here is to
take the symmetric algebra [54a] on the direct sum $V^a \oplus V_a \oplus V^b \oplus V_b \oplus \cdots$
and then, for each pair of disjoint (finite) sets of elements of L, say a, p, r and b,
m, we select the corresponding subspace $V^{apr}_{bm}$ spanned by the elements of
the form

$$\xi^a\eta^p\zeta^r\theta_b\phi_m \tag{3.8}$$


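The general element of $V^{apr}_{bm}$ will be a linear combination of expressions like
(3.8). The construction so made ensures that each product (3.8) is fully
commutative and also that the various distributive laws hold (for example,
$\omega^{apr}(\theta_b + \psi_b) = \omega^{apr}\theta_b + \omega^{apr}\psi_b$ with $\omega^{apr} \in V^{apr}$). There is no significance in
the ordering of a, p, r in $V^{apr}_{bm}$ or of b, m. Thus $V^{apr}_{bm} = V^{par}_{bm} = V^{apr}_{mb}$, etc.
However, the ordering of the indices for an element $\rho^{apr}_{bm}$ is significant.
Every element $\rho^{apr}_{bm} \in V^{apr}_{bm}$ is a linear combination of commutative
products of the type (3.8):

$$\rho^{apr}_{bm} = \sum_{i=1}^{M} \overset{(i)}{\lambda}\,\overset{(i)}{\xi}{}^{a}\,\overset{(i)}{\eta}{}^{p}\,\overset{(i)}{\zeta}{}^{r}\,\overset{(i)}{\theta}{}_{b}\,\overset{(i)}{\phi}{}_{m} \tag{3.9}$$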


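but there will be many ways of expressing $\rho^{apr}_{bm}$ as such. A convenient
criterion for the equality of two expressions (3.9) for the modules that we are
interested in is that for every choice $\alpha_a \in V_a$, $\beta_p \in V_p$, $\gamma_r \in V_r$, $\sigma^b \in V^b$,
$\tau^m \in V^m$, the scalar

$$\rho^{apr}_{bm}\,\alpha_a\beta_p\gamma_r\sigma^b\tau^m = \sum_{i=1}^{M} \overset{(i)}{\lambda}\,(\overset{(i)}{\xi}{}^{a}\alpha_a)(\overset{(i)}{\eta}{}^{p}\beta_p)(\overset{(i)}{\zeta}{}^{r}\gamma_r)(\overset{(i)}{\theta}{}_{b}\sigma^b)(\overset{(i)}{\phi}{}_{m}\tau^m) \tag{3.10}$$

should be the same for both expressions. From this all the algebraic properties
will follow. (Note that in general $\rho^{apr}_{bm} \neq \rho^{par}_{bm} \neq \rho^{apr}_{mb}$, etc.)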


The entire tensor system {V} consists of all of the $V^{x \ldots z}_{u \ldots w}$, including
V = F:

$$\{V\} = (V, V^a, V_a, V^b, \ldots, V^{ab}, V^a_b, \ldots, V^{x \ldots z}_{u \ldots w}, \ldots)$$

There are four basic operations on {V}, namely 


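ADDITION: $V^{x \ldots z}_{u \ldots w} \times V^{x \ldots z}_{u \ldots w} \to V^{x \ldots z}_{u \ldots w}$ (3.11)
MULTIPLICATION: $V^{x \ldots z}_{u \ldots w} \times V^{f \ldots h}_{k \ldots m} \to V^{x \ldots z f \ldots h}_{u \ldots w k \ldots m}$ (3.12)
INDEX SUBSTITUTION: $V^{x \ldots z}_{u \ldots w} \to V^{f \ldots h}_{k \ldots m}$ (3.13)
(a, b)-CONTRACTION: $V^{x \ldots z a}_{u \ldots w b} \to V^{x \ldots z}_{u \ldots w}$ (3.14)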




In (3.11), (3.12), and (3.14), the differently denoted index letters appearing
are all assumed to be different elements of L. In (3.13), the elements x, ...,
z, u, ..., w of L are all distinct and so are f, ..., h, k, ..., m. Otherwise they
are unrestricted except that x, ..., z and f, ..., h are equal in number and that
u, ..., w and k, ..., m are equal in number. Addition and multiplication
are defined the obvious way. Index substitution is induced simply by a
permutation applied to L. (The validity of any equation is unaffected by a
permutation of the elements of L.) To define contraction, we consider, for
example, the (p, b)-contraction: $V^{apr}_{bm} \to V^{ar}_{m}$ as applied to the element $\rho^{apr}_{bm} \in V^{apr}_{bm}$
given by (3.9). The result is

$$\rho^{axr}_{xm} = \sum_{i=1}^{M} \overset{(i)}{\lambda}\,(\overset{(i)}{\eta}{}^{x}\,\overset{(i)}{\theta}{}_{x})\,\overset{(i)}{\xi}{}^{a}\,\overset{(i)}{\zeta}{}^{r}\,\overset{(i)}{\phi}{}_{m} \tag{3.15}$$

We have $\rho^{axr}_{xm} \in V^{ar}_{m}$ so the x labels are "dummies" which do not contribute
to the total valence type.
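
The simplest instance of contraction may help fix ideas. The following worked case is a sketch added for illustration, using an element with one upper and one lower label; the element $\rho^a_b$ and the sum over $i$ are chosen here and are not taken from the text.

```latex
% An element of V^a_b written as a sum of commutative products:
\rho^{a}_{b} = \sum_{i=1}^{M} \overset{(i)}{\lambda}\,\overset{(i)}{\xi}{}^{a}\,\overset{(i)}{\theta}{}_{b} .
% Its (a, b)-contraction replaces each product by the scalar obtained by
% letting the dual element act on the vector:
\rho^{x}_{x} = \sum_{i=1}^{M} \overset{(i)}{\lambda}\,\bigl(\overset{(i)}{\xi}{}^{x}\,\overset{(i)}{\theta}{}_{x}\bigr) \in F .
% By (3.5) the choice of dummy label is immaterial:
\rho^{x}_{x} = \rho^{y}_{y} .
```

When a basis is available, this scalar is the trace of the matrix of components of $\rho^a_b$, which is why the vanishing-trace property mentioned in the text is what guarantees that contraction is well defined.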

It may be verified algebraically that all the usual tensor rules¹⁴ follow
from the above constructions. Thus, addition gives an Abelian group
structure to each $V^{x \ldots z}_{u \ldots w}$. Multiplication is commutative and distributive
over addition. Contraction appropriately commutes with addition, with 
multiplication, and with other contractions. The contraction of a zero 
element is again a zero element. (If we use equality between the scalars 
(3.10) as the definition of equality between formal expressions (3.9), then this 
last property follows from the property that any matrix over F whose square 
vanishes also has vanishing trace. This property holds in the cases which 
interest us here, but would not be true for certain rings of finite characteristic.) 

So far, the question of a basis frame for $V^\bullet$ has not even arisen.
However, it is often convenient to work with basis frames and we shall need
a notation to be able to distinguish basis indices from the abstract labels.
I shall adopt the convention that German indices $\mathfrak{a}, \mathfrak{b}, \ldots, \mathfrak{a}_0, \ldots$ will denote
the numbering of basis elements in the standard way, that is, each of $\mathfrak{a}, \mathfrak{b}, \ldots$
denotes one of the integers 0, 1, ..., N [for an (N + 1)-dimensional space].
The use of German indices will be to remind us of two things; first that a
choice of a (possibly arbitrary) basis frame is involved in any expression
containing such indices, with a consequent loss of covariance; and secondly
that the Einstein summation convention is being used whenever repeated
indices occur in a term in an expression. Now, let $\delta_0, \delta_1, \ldots, \delta_N \in V^\bullet$ be a
basis for $V^\bullet$ (assuming finite-dimensionality) and let $\delta^0, \delta^1, \ldots, \delta^N \in V_\bullet$ be
the corresponding dual basis. For $x \in L$ we have canonical images in $V^x$
and $V_x$:

$$\delta^x_0, \ldots, \delta^x_N \in V^x, \qquad \delta^0_x, \ldots, \delta^N_x \in V_x \tag{3.16}$$


14 See any standard work on classical tensor calculus, for example, [100]. 




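We may use the generic symbols $\delta^x_{\mathfrak{a}} \in V^x$, $\delta^{\mathfrak{a}}_x \in V_x$, so the basis orthogonality
relation takes the form

$$\delta^x_{\mathfrak{a}}\,\delta^{\mathfrak{b}}_x = \delta^{\mathfrak{b}}_{\mathfrak{a}} \tag{3.17}$$

where $\delta^{\mathfrak{b}}_{\mathfrak{a}}$ is the ordinary Kronecker delta symbol. (No relation between x
and $\mathfrak{x}$ is to be implied by the notation.) We can also define an element
$\delta^x_y$ of $V^x_y$ by

$$\delta^{\mathfrak{a}}_y\,\delta^x_{\mathfrak{a}} = \delta^x_y \tag{3.18}$$

The quantity $\delta^x_y$ satisfies the usual properties

$$\xi^y\delta^x_y = \xi^x, \qquad \theta_x\delta^x_y = \theta_y \tag{3.19}$$

and is clearly actually independent of the choice of basis, by virtue of (3.19).
The components, with respect to the basis, of any element $\rho^{x \ldots z}_{u \ldots w} \in V^{x \ldots z}_{u \ldots w}$
are given by

$$\rho^{\mathfrak{x} \ldots \mathfrak{z}}_{\mathfrak{u} \ldots \mathfrak{w}} = \rho^{x \ldots z}_{u \ldots w}\,\delta^{\mathfrak{x}}_x \cdots \delta^{\mathfrak{z}}_z\,\delta^u_{\mathfrak{u}} \cdots \delta^w_{\mathfrak{w}} \tag{3.20}$$

Conversely, to express $\rho^{x \ldots z}_{u \ldots w}$ in terms of its components, we simply write

$$\rho^{x \ldots z}_{u \ldots w} = \rho^{\mathfrak{x} \ldots \mathfrak{z}}_{\mathfrak{u} \ldots \mathfrak{w}}\,\delta^x_{\mathfrak{x}} \cdots \delta^z_{\mathfrak{z}}\,\delta^{\mathfrak{u}}_u \cdots \delta^{\mathfrak{w}}_w \tag{3.21}$$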


One aspect of this notation which is an advantage in certain contexts is that 
we may convert some indices into component form and leave others as 
abstract labels: 


$$\rho^{a\mathfrak{p}r}_{bm} = \rho^{apr}_{bm}\,\delta^{\mathfrak{p}}_p \in V^{ar}_{bm} \tag{3.22}$$


Generally, all algebraic relations will be unaffected by whether or not an 
index is in German type. But when we consider covariant derivatives in the 
next section, we shall see that an important formal difference arises between 
the ways the two types of index are treated—in addition to the present 
merely conceptual difference. 

An elementary, but important, property of the type of algebraic
structure that we have built up here, is that we can sometimes embed one
such structure in another by the device of grouping together indices. Thus,
we may consider a new labeling set $\mathcal{L}$, say, whose elements are (disjoint)
subsets of elements of L. For example, we could put $\alpha = abc$, $\beta = def$,
$\gamma = ghi$, etc., where L is divided exhaustively into disjoint triplets, these
triplets being the elements of $\mathcal{L}$. It is clear that, in this case, the sets
$V^\alpha = V^{abc}$, $V^\beta = V^{def}$, ..., $V_\alpha = V_{abc}$, ..., $V^\alpha_\beta = V^{abc}_{def}$, ... will satisfy just the
same rules as before. More complicated groupings are also possible.

We may also consider systems where the labeling set contains elements 
of different “types.” The only modifications of our scheme would then 
come in index substitution (where only labels of the same type may be 
substituted for one another) and in contraction, where only upper and
lower indices of the same type may be contracted together. When we
consider spinors in the next section we shall see an example of this kind of 
system. We shall have labels of two different types (related to each other 
via an operation of complex conjugation). 


4 SPACE-TIMES WITH SPINOR STRUCTURE 


In order to discuss the structure of space-time in detail, we shall need
a definition of a differentiable manifold. I shall use the definition as given
by Chevalley, cf. Nomizu [65], which gives the differentiable structure of a
manifold $\mathcal{M}$ entirely in terms of the $C^\infty$ real-valued functions on $\mathcal{M}$. I have
chosen to use this definition here, not because of any special intrinsic mathematical
merit, but because I feel it reflects most nearly the role that is played
by coordinate systems in physics. Thus, a coordinate system in space-time
is actually a set of four real-valued functions on the space-time such that, in
some neighborhood, knowledge of the values of these functions will fix the
point in a "smooth" manner.

Let us start with space-time $\mathcal{M}$ as a topological space. A set of subsets
of $\mathcal{M}$, referred to as the open sets, is singled out subject to:


THE INTERSECTION OF ANY TWO OPEN SETS IS OPEN. (4.1)
THE UNION OF ANY SET OF OPEN SETS IS OPEN. (4.2)


THE HAUSDORFF CONDITION HOLDS (THAT IS, FOR ANY TWO POINTS
OF $\mathcal{M}$, TWO DISJOINT OPEN SETS EXIST EACH CONTAINING ONE
OF THE POINTS). (4.3)


$\mathcal{M}$ IS CONNECTED (THAT IS, NOT THE UNION OF TWO DISJOINT
NONEMPTY OPEN SETS). (4.4)


THERE IS A COUNTABLE NUMBER OF OPEN SETS, OF WHICH EVERY
OPEN SET ON $\mathcal{M}$ IS A UNION. (4.5)


The concept of an open set is a very nonphysical one, but it is essential to the 
mathematical idea of a manifold. To dispense with the open set concept 
we would have to change the theory. The question of defining open sets 
in terms of the more physical notion of causality will be touched upon in 
Section 11. 

We now single out a certain set T, of real-valued functions on $\mathcal{M}$,
which will turn out to be the $C^\infty$ real functions on the space-time $\mathcal{M}$. We
can characterize the differentiable structure of $\mathcal{M}$ by the following axioms
on T:


IF $f_1, f_2, \ldots, f_n \in T$ AND IF $\phi$ IS ANY REAL-VALUED $C^\infty$-FUNCTION OF
n VARIABLES (IN THE ORDINARY SENSE), THEN $\phi(f_1, f_2, \ldots, f_n) \in T$; (4.6)




IF f IS SUCH THAT, FOR EACH $P \in \mathcal{M}$ THERE EXISTS AN ELEMENT
$f_{(P)} \in T$, WHICH AGREES WITH f THROUGHOUT AN OPEN SET CONTAINING
P, THEN $f \in T$; (4.7)


FOR EACH $P \in \mathcal{M}$, THERE IS AN OPEN SET $\mathcal{S}$ CONTAINING P AND
FOUR ELEMENTS $x^0, x^1, x^2, x^3 \in T$ SUCH THAT THE MAP WHICH
CARRIES $Q \in \mathcal{S}$ INTO $(x^0(Q), x^1(Q), x^2(Q), x^3(Q))$ IS A HOMEOMORPHISM
ONTO AN OPEN SUBSET OF $R^4$, AND SUCH THAT ANY
$f \in T$ AGREES, ON $\mathcal{S}$, WITH SOME $C^\infty$-FUNCTION OF $x^0, \ldots, x^3$. (4.8)


The property (4.8) expresses what is required of the functions $x^0, \ldots, x^3$
on $\mathcal{S}$ in order that they should be adequate as local coordinates about P.

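We can define a $C^\infty$ contravariant vector field $\xi$ on $\mathcal{M}$ (that is, a $C^\infty$
cross section of the tangent bundle of $\mathcal{M}$) as a derivation on the algebra T
(over the constants), that is, a mapping $\xi: T \to T$ such that

$$\xi(f + g) = \xi(f) + \xi(g) \tag{4.9}$$
$$\xi(fg) = f\xi(g) + g\xi(f) \tag{4.10}$$
$$\xi(k) = 0 \ \text{if } k \text{ is constant on } \mathcal{M} \tag{4.11}$$

The set $T^\bullet$ of such $\xi$'s is clearly a module over T, with $(\xi + \eta)(f) = \xi(f)
+ \eta(f)$, $(\lambda\xi)(f) = \lambda\xi(f)$ ($\lambda \in T$; $\xi, \eta \in T^\bullet$). It can be shown that, in terms
of local coordinates $x^0, \ldots, x^3$ in an open set $\mathcal{S} \subset \mathcal{M}$, any $\xi \in T^\bullet$ has the
form

$$\xi = \xi^{\mathfrak{a}}\,\frac{\partial}{\partial x^{\mathfrak{a}}} \tag{4.12}$$

throughout $\mathcal{S}$. Thus, we can think of $\xi^{\mathfrak{a}}$ as the components of $\xi$ in the
coordinate system $x^{\mathfrak{a}}$ on $\mathcal{S}$. (When we pass to another local coordinate
patch, the components $\xi^{\mathfrak{a}}$ transform in the overlap region in the familiar way.)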
At this point we could introduce, via $V^\bullet = T^\bullet$, F = T, the system of
tensors with labels, as in Section 3, and we should arrive at a coordinate-free
"abstract index" version of the familiar tensor calculus. What I propose
to do here is something slightly different. I wish to present a somewhat
unusual development of space-time theory. The reason for this touches
on what is perhaps an essential difference between mathematics and physics.
In mathematics, one is interested in formalisms of maximum generality, so 
that any results obtained will have the widest possible application. Some- 
times these applications are very far removed from the original topic, and 
great conceptual unifications may be thereby achieved. In physics, the aim is 
somewhat different. We are given a very specific (?) structure which we do 
not adequately understand—namely, the universe. Certain aspects of this 
structure are mirrored very accurately (in a miraculous way!) by certain 
mathematical models. However, we know, for various reasons, that these 




models are not quite (?) right. Thus, we are always interested in altering 
the mathematical theory. Now a formalism for physics which contains even 
an arbitrary parameter (such as the dimension of differentiable manifold, 
for example) is in a sense too general. We search for a formalism which is 
specific rather than general. Sometimes, by reformulating an old theory 
in an unusual (though mathematically equivalent) way, previously unexpected 
possibilities for modifying the theory may appear as mathematically natural. 
An example of this would be Newtonian gravitational theory. In the 
normal version of the theory, one might, for example, envisage changing the 
exponent —2 in the inverse square law to some other power. (In fact, such 
alterations were tried, in order to explain the motion of Mercury, but this 
does not work!) On the other hand, if Newtonian theory had been refor- 
mulated as a space-time theory in the way indicated in Section 2, then a quite 
different alteration—namely, to some form of general relativity—would have 
appeared as mathematically natural. 

What I propose to do here is introduce a formalism for the description 
of space-time which makes its spinor structure¹⁵ emerge as more basic
even than its pseudo-Riemannian structure. In fact, the particular dimen- 
sionality and signature (+, —, —, —), will have to be built in at the outset. 
Thus, if a modification of the present-day differentiable manifold picture of 
space-time were to emerge, which depended essentially on the existence of 
such a spinor structure, then the dimension and signature of our space-time 
would be one of the consequences of the theory. 

The formalism will be based on the isomorphism¹⁶ between the group
SL(2, C) of unimodular (2 × 2) complex matrices, and the twofold (universal)
covering of the connected component of the Lorentz group O(1, 3). To
express this isomorphism in its most direct form, consider the general (2 × 2)
Hermitian matrix, which I shall write as

$$\begin{pmatrix} u^{00'} & u^{01'} \\ u^{10'} & u^{11'} \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} u^0 + u^1 & u^2 + iu^3 \\ u^2 - iu^3 & u^0 - u^1 \end{pmatrix} \tag{4.13}$$

If the matrix (4.13) is multiplied on the left by a unimodular (2 × 2) complex
matrix $(t^{\mathfrak{A}}{}_{\mathfrak{B}})$ and on the right by the conjugate transpose of this matrix, then
both the Hermiticity and the value of the determinant of (4.13) will be
preserved. Thus, we obtain a linear transformation of $(u^0, u^1, u^2, u^3)$ which
preserves both the reality of the $u^{\mathfrak{a}}$ and the form

$$\eta_{\mathfrak{ab}}\,u^{\mathfrak{a}}u^{\mathfrak{b}} = (u^0)^2 - (u^1)^2 - (u^2)^2 - (u^3)^2 \tag{4.14}$$


15 I am using the phrase "spinor structure" rather than "spin structure," to
emphasize that in addition to being a spin manifold, $\mathcal{M}$ must possess a structure defined
by the particular type of spinor system used here.

16 The (2-1) isomorphism between SU(2, 2) and the connected component of
O(2, 4) can also be used in the study of space-time structure (cf. [79, 80]).




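where

$$(\eta_{\mathfrak{ab}}) = \operatorname{diag}(1, -1, -1, -1) \tag{4.15}$$

This is therefore a Lorentz transformation on $u^{\mathfrak{a}}$:

$$u^{\mathfrak{a}} \to L^{\mathfrak{a}}{}_{\mathfrak{b}}\,u^{\mathfrak{b}} \tag{4.16}$$

which is continuous with the identity (since $t^{\mathfrak{A}}{}_{\mathfrak{B}}$ is continuous with $\delta^{\mathfrak{A}}_{\mathfrak{B}}$).
Conversely, it follows (for a variety of reasons) that any such transformation
(4.16) arises from precisely two unimodular matrices, namely, $(t^{\mathfrak{A}}{}_{\mathfrak{B}})$
and $(-t^{\mathfrak{A}}{}_{\mathfrak{B}})$. Thus, (4.16) is equivalent to

$$u^{\mathfrak{A}\mathfrak{A}'} \to t^{\mathfrak{A}}{}_{\mathfrak{B}}\,\bar{t}^{\mathfrak{A}'}{}_{\mathfrak{B}'}\,u^{\mathfrak{B}\mathfrak{B}'} \tag{4.17}$$

(The notation, here, is to regard $\mathfrak{A}$ and $\mathfrak{A}'$ as distinct letters, so, in particular,
no summation is implied between them. Thus, (4.17) is four relations.
When complex conjugates are taken, unprimed indices become primed and
primed indices become unprimed. Thus, $\bar{t}^{0'}{}_{1'}$ is the complex conjugate of
$t^{0}{}_{1}$, for example. Capital German indices run over the two values 0, 1 or
0', 1'. Lower case German indices will run over 0, 1, 2, 3.)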
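
The claims made about (4.13) through (4.17) are easy to test numerically. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the particular arrangement of $(u^0, u^1, u^2, u^3)$ inside the Hermitian matrix is an assumed convention. The invariance of the determinant, and hence of the form (4.14), does not depend on that arrangement.

```python
import math

SQRT2 = math.sqrt(2.0)


def hermitian_from_vector(u):
    """Arrange (u0, u1, u2, u3) as a 2x2 Hermitian matrix (assumed convention)."""
    u0, u1, u2, u3 = u
    return [[(u0 + u1) / SQRT2, complex(u2, u3) / SQRT2],
            [complex(u2, -u3) / SQRT2, (u0 - u1) / SQRT2]]


def vector_from_hermitian(m):
    """Invert hermitian_from_vector, reading off real components."""
    u0 = (m[0][0] + m[1][1]).real / SQRT2
    u1 = (m[0][0] - m[1][1]).real / SQRT2
    return (u0, u1, m[0][1].real * SQRT2, m[0][1].imag * SQRT2)


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def interval(u):
    """The quadratic form (u0)^2 - (u1)^2 - (u2)^2 - (u3)^2."""
    return u[0] ** 2 - u[1] ** 2 - u[2] ** 2 - u[3] ** 2


def transformed_matrix(t, u):
    """t H t-dagger, where H is the Hermitian matrix built from u."""
    return matmul(matmul(t, hermitian_from_vector(u)), dagger(t))


def lorentz_action(t, u):
    return vector_from_hermitian(transformed_matrix(t, u))


def unimodular(a, b, c):
    """[[a, b], [c, d]] with d fixed so that the determinant is 1 (a != 0)."""
    return [[a, b], [c, (1 + b * c) / a]]


t = unimodular(complex(1.5, 0.5), complex(0.2, -0.7), complex(-0.3, 0.4))
minus_t = [[-entry for entry in row] for row in t]
u = (2.0, 0.3, -1.1, 0.7)

v = lorentz_action(t, u)              # image of u under t
v_minus = lorentz_action(minus_t, u)  # image of u under -t: the same vector
```

Since the determinant of the Hermitian matrix is half the quadratic form, `interval(v)` agrees with `interval(u)` up to rounding, and the two matrices `t` and `minus_t` give the same vector, which is the two-to-one correspondence described in the text.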

The usual way to think of the components of a world vector in a
pseudo-orthonormal frame is as a linearly ordered array $(u^0, u^1, u^2, u^3)$.
But alternatively, the matrix version (4.13) would be just as good, and the
idea here will be to regard this matrix description as more basic. In this
way, a world vector appears as not the most elementary "vectorial" quantity
in space-time, since it is now a "divalent" object and we would expect that
there might also be "univalent" (two-component) objects with some space-time
meaning. In fact, this turns out to be the case. These "univalent"
objects will be called spin-vectors. The idea, then, will be to take as our
basic module (the "$V^\bullet$" of Section 3) not the set of world-vector fields $T^\bullet$,
but the $C^\infty$ spin-vector fields¹⁷ $S^\bullet$ on $\mathcal{M}$. The fact that I have not yet
defined a spin-vector in space-time terms will not matter. The procedure is
more the other way around. The algebra generated by the spin-vectors will
serve, instead, to define the structure of the space-time. Once we have this
structure we shall then be able to go back and interpret the spin-vectors in
more familiar terms. We shall, in fact, end up with a remarkably complete
interpretation of a spin-vector in "physical" terms (cf. Section 5).

Now what intrinsic structure should our module $S^\bullet$ of spin-vector
fields possess? At each point $P \in \mathcal{M}$, we require the fiber $S^\bullet(P)$ above P
to be a two-dimensional complex vector space ("spin space"). The symmetry
of this space is to be given by the group SL(2, C), so that the
local Lorentz group symmetry of space-time will be reflected in spin space


17 The system $S^\bullet$ will turn out to be the $C^\infty$ cross sections of the bundle of spin-vectors
over $\mathcal{M}$, regarded as a module over the $C^\infty$ complex-valued functions on $\mathcal{M}$.




symmetry. If $\omega^0, \omega^1$ are the components of $\omega \in S^\bullet(P)$ in some basis frame,
then an (active) symmetry transformation of $S^\bullet(P)$ will be given by

$$\omega^{\mathfrak{A}} \to t^{\mathfrak{A}}{}_{\mathfrak{B}}\,\omega^{\mathfrak{B}} \tag{4.18}$$

(keeping the basis frame fixed), the matrix $(t^{\mathfrak{A}}{}_{\mathfrak{B}})$ being complex and unimodular.
Comparing (4.18) with (4.17) and (4.16) [via (4.13)] we see that
if we regard the world-vector space $T^\bullet(P)$ as being actually the tensor product
of $S^\bullet(P)$ with its complex conjugate, then the allowable (active) symmetries
of $T^\bullet(P)$ become just Lorentz transformations which do not involve space
reflections or time reversals. Each element of the module $S^\bullet$ will be a disjoint
union, over all $P \in \mathcal{M}$, of an element from each $S^\bullet(P)$. Some differentiable
structure will be imposed as well. This will be defined axiomatically, shortly.
An object built up of tensor products of $S^\bullet$, its dual $S_\bullet$ and the complex
conjugates of these two spaces will be called a spinor field on $\mathcal{M}$. A world-vector
field is thus seen to be a special case. Here, "tensor product" will
be interpreted in the sense given in Section 3. The labeling system will be
very useful in order to keep track of the various operations.
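We shall need spinor labels of two types:

$$A, B, C, \ldots, Z, A_0, B_0, \ldots, Z_0, A_1, \ldots \tag{4.19}$$
$$A', B', C', \ldots, Z', A'_0, B'_0, \ldots, Z'_0, A'_1, \ldots \tag{4.20}$$

For each $\omega \in S^\bullet$, there will correspond elements $\omega^A \in S^A$, $\omega^B \in S^B$, ...,
$\omega^X \in S^X$, ... and also elements $\bar\omega^{A'} \in S^{A'}$, $\bar\omega^{B'} \in S^{B'}$, ..., $\bar\omega^{X'} \in S^{X'}$, .... As
before, we can, if desired, think of $\omega^X$ as being simply the pair ($\omega$, X), etc.
The ring S will be identified as the complex $C^\infty$-functions on $\mathcal{M}$:

$$S = T \oplus iT \tag{4.21}$$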


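The difference between the primed and unprimed labels arises because we
shall require that, whereas each of the modules $S^A, S^B, \ldots$ over S is canonically
isomorphic with $S^\bullet$, the modules $S^{A'}, S^{B'}, \ldots$ over S are each canonically
anti-isomorphic with $S^\bullet$. That is, if $\lambda, \mu \in S$ and $\alpha, \beta, \gamma \in S^\bullet$, with $\gamma = \lambda\alpha + \mu\beta$,
then

$$\gamma^X = \lambda\alpha^X + \mu\beta^X \quad \text{and} \quad \bar\gamma^{X'} = \bar\lambda\bar\alpha^{X'} + \bar\mu\bar\beta^{X'} \tag{4.22}$$

We shall, in fact, regard each $\bar\omega^{X'}$ as being the complex conjugate of the
corresponding $\omega^X$ and occasionally write

$$\overline{\omega^X} = \bar\omega^{X'}, \qquad \overline{\bar\omega^{X'}} = \omega^X \tag{4.23}$$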
(We may think of the complex conjugation operation as effecting an interchange
of the list (4.19) with the list (4.20), when applied to the labels themselves.)
Our spinor system will then be built up just as in Section 3, but
with the proviso that contractions and index substitutions can occur only
between index labels of the same type (that is, only between primed or only
between unprimed labels). Since index substitutions do not effect interchanges
of primed index labels with unprimed ones, we shall regard the
relative ordering of primed and unprimed indices as irrelevant. On the other
hand, it will be significant to maintain an ordering (in most cases) between
indices of the same type even when some are upper indices and others are
lower indices. Thus, for example,


$$\rho^{AA'}{}_{B}{}^{B'} = \rho^{A'A}{}_{B}{}^{B'} = \rho^{AA'B'}{}_{B} \neq \rho_{B}{}^{AA'B'} \tag{4.24}$$


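Complex conjugation gives us a fifth operation on the spinor system $\{S\}$,
in addition to (3.11)-(3.14):

$$\text{complex conjugation}: \quad S^{A \ldots C' \ldots}_{E \ldots G' \ldots} \to S^{A' \ldots C \ldots}_{E' \ldots G \ldots} \tag{4.25}$$

defined on each element $\rho^{A \ldots C' \ldots}{}_{E \ldots G' \ldots}$ [expanded as in (3.9)] in the
obvious way [cf. (4.23)].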

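So far, we have not incorporated into the spinor algebra the fact that
the $(t^{\mathfrak{A}}{}_{\mathfrak{B}})$ matrices have to be unimodular. This unimodular condition can
be written alternatively as

$$\varepsilon_{\mathfrak{AB}}\, t^{\mathfrak{A}}{}_{\mathfrak{C}}\, t^{\mathfrak{B}}{}_{\mathfrak{D}} = \varepsilon_{\mathfrak{CD}} \quad\text{or}\quad \varepsilon^{\mathfrak{CD}}\, t^{\mathfrak{A}}{}_{\mathfrak{C}}\, t^{\mathfrak{B}}{}_{\mathfrak{D}} = \varepsilon^{\mathfrak{AB}} \tag{4.26}$$

Here, the Levi-Civita symbols

$$(\varepsilon_{\mathfrak{AB}}) = (\varepsilon^{\mathfrak{AB}}) = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \tag{4.27}$$

are being used. Let us denote by $\varepsilon_{AB}$ and $\varepsilon^{AB}$ the elements of $S_{AB}$ and $S^{AB}$,
respectively, whose components in some basis are given by (4.27). Then
(4.26) tells us that $\varepsilon_{AB}$ and $\varepsilon^{AB}$ are invariant under the active symmetry
transformation of $S^{\bullet}(P)$ given by (4.18). We shall regard $\varepsilon_{AB}$, $\varepsilon^{AB}$ and their
complex conjugates $\varepsilon_{A'B'}$, $\varepsilon^{A'B'}$ as an essential part of the internal structure
of the spinor system. (It would have been more logical to write $\bar\varepsilon_{A'B'}$ for
$\varepsilon_{A'B'}$ and $\bar\varepsilon^{A'B'}$ for $\varepsilon^{A'B'}$, but these bars are normally omitted.) Any basis for
which the components $\varepsilon_{\mathfrak{AB}}$ have the form (4.27) will be called a normalized
basis or spin frame.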

One effect of the presence of the $\varepsilon$'s is that they establish a canonical
isomorphism between $S^{\bullet}$ and its dual $S_{\bullet}$. This is achieved via the lowering
and raising of spinor index labels:

$$\xi_{B} = \xi^{A}\varepsilon_{AB} \tag{4.28}$$
$$\xi^{A} = \varepsilon^{AB}\xi_{B} \tag{4.29}$$

(In view of this, it may be felt that it is redundant to use both lower and upper
labels. The reason for maintaining the distinction is essentially a matter
of "bookkeeping." In fact, a notation is possible which does not maintain
this distinction, but it looks a little strange since spin vectors must then
anticommute! I have chosen to be conventional in this respect.) Note that

$$\xi_{A}\eta^{A} = -\xi^{A}\eta_{A} \tag{4.30}$$

(and similarly for primed indices) because of the antisymmetry:

$$\varepsilon_{AB} = -\varepsilon_{BA}, \qquad \varepsilon^{AB} = -\varepsilon^{BA} \tag{4.31}$$

so we must be careful about orderings when we raise and lower indices. For
this reason, and so as to avoid notational confusion at a later stage, the
symbols $\delta_{A}^{B}$ and $\delta_{A'}^{B'}$, corresponding to (3.18) will not be used here. Instead,
the equivalent symbols $\varepsilon_{A}{}^{B}$, $\varepsilon_{A'}{}^{B'}$ will be employed. We have

$$\varepsilon_{AC}\varepsilon^{BC} = \varepsilon_{A}{}^{B} = -\varepsilon^{B}{}_{A} \tag{4.32}$$
$$\xi^{B}\varepsilon_{B}{}^{A} = \xi^{A}, \qquad \xi_{A}\varepsilon_{B}{}^{A} = \xi_{B} \tag{4.33}$$

so that $\varepsilon_{A}{}^{B}$ has the properties (3.19) required of a "$\delta_{A}^{B}$." (Note that the
raising and lowering operations are consistent.) With any choice of spin
frame $\varepsilon_{\mathfrak{A}}{}^{A}$, $\varepsilon_{A}{}^{\mathfrak{A}}$ we have

$$(\varepsilon_{\mathfrak{A}}{}^{\mathfrak{B}}) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \tag{4.34}$$


If a spinor is skew-symmetrical in three or more indices then it vanishes
(from two-dimensionality). Hence,

$$\varepsilon_{AB}\varepsilon_{CD} + \varepsilon_{BC}\varepsilon_{AD} + \varepsilon_{CA}\varepsilon_{BD} = 0 \tag{4.35}$$

One consequence of (4.35) is that if $\zeta_{\ldots A \ldots B \ldots}$ is skew-symmetrical in $A$
and $B$:

$$\zeta_{\ldots A \ldots B \ldots} = -\zeta_{\ldots B \ldots A \ldots} \tag{4.36}$$

then

$$\zeta_{\ldots A \ldots B \ldots} = \tfrac{1}{2}\,\varepsilon_{AB}\,\zeta_{\ldots C \ldots}{}^{C}{}_{\ldots} \tag{4.37}$$

This property is closely related to the representation theory of SL(2, C),
since it indicates that it is only spinors totally symmetrical in all unprimed
indices and in all primed indices which cannot be reduced into simpler parts.
The components of a spinor with $r$ symmetric lower unprimed indices and
with $s$ symmetric lower primed indices are said to transform according to the
$D(r/2, s/2)$ irreducible representation of the Lorentz group.
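
The sign rules around (4.28)-(4.35) are easy to get wrong, so a small numerical check may help. The sketch below is only an illustration: it assumes the lowering rule $\xi_B = \xi^A\varepsilon_{AB}$ and the raising rule $\xi^A = \varepsilon^{AB}\xi_B$, with components taken in a spin frame as in (4.27). The function names `lower`, `raise_` and `cyclic_identity_holds` are hypothetical helpers, not notation from the text.

```python
from itertools import product

# Components of eps_{AB} (and of eps^{AB}) in a spin frame, as in (4.27).
EPS = [[0, 1], [-1, 0]]


def lower(xi):
    """xi_B = xi^A eps_{AB} (assumed index-lowering convention)."""
    return [sum(xi[a] * EPS[a][b] for a in range(2)) for b in range(2)]


def raise_(xi_low):
    """xi^A = eps^{AB} xi_B (assumed index-raising convention)."""
    return [sum(EPS[a][b] * xi_low[b] for b in range(2)) for a in range(2)]


def cyclic_identity_holds():
    """Check eps_AB eps_CD + eps_BC eps_AD + eps_CA eps_BD = 0 for all index values."""
    return all(
        EPS[a][b] * EPS[c][d] + EPS[b][c] * EPS[a][d] + EPS[c][a] * EPS[b][d] == 0
        for a, b, c, d in product(range(2), repeat=4)
    )


xi = [2 + 1j, -3j]
eta = [1 - 1j, 4]
xi_low, eta_low = lower(xi), lower(eta)

# The two possible contractions of xi with eta differ by a sign, as in (4.30).
down_up = sum(xi_low[a] * eta[a] for a in range(2))   # xi_A eta^A
up_down = sum(xi[a] * eta_low[a] for a in range(2))   # xi^A eta_A
```

With these conventions, lowering and then raising returns the original components, which is the consistency noted after (4.33).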

Let us now incorporate the world-tensors into our spinor scheme,
introducing world-tensor labels (3.1) by identifying them as pairs of spinor
labels as follows:

$$a = AA', \quad b = BB', \quad c = CC', \quad \ldots, \quad a_0 = A_0A_0', \quad \ldots \tag{4.38}$$

This gives us the subsystem $(S, S^{a}, S^{b}, \ldots, S_{a}, \ldots, S^{a \ldots}_{c \ldots}, \ldots)$ of complex
world tensors, closed under the operations (3.11)-(3.14) and also under the
complex conjugation operation (4.25), which here acts: $S^{a \ldots}_{c \ldots} \to S^{a \ldots}_{c \ldots}$.
The elements which are invariant under complex conjugation are the real
world-tensors: $\chi^{a \ldots}{}_{c \ldots} = \bar\chi^{a \ldots}{}_{c \ldots}$. The real world-tensor fields form a
subsystem $\{T\}$ of $\{S\}$:

$$(T, T^{a}, T^{b}, \ldots, T_{a}, \ldots, T^{a \ldots}_{c \ldots}, \ldots) \tag{4.39}$$
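To see that this process is really what is involved in (4.13), let us
consider a spin frame $\varepsilon_{\mathfrak{A}}{}^{A}$, $\varepsilon_{A}{}^{\mathfrak{A}}$ for $S^{A}$ and define
(with $\overline{\varepsilon_{A}{}^{\mathfrak{A}}} = \varepsilon_{A'}{}^{\mathfrak{A}'}$)

$$\begin{aligned}
\delta_{a}{}^{0} &= \tfrac{1}{\sqrt{2}}\left(\varepsilon_{A}{}^{0}\varepsilon_{A'}{}^{0'} + \varepsilon_{A}{}^{1}\varepsilon_{A'}{}^{1'}\right) \\
\delta_{a}{}^{1} &= \tfrac{1}{\sqrt{2}}\left(\varepsilon_{A}{}^{0}\varepsilon_{A'}{}^{0'} - \varepsilon_{A}{}^{1}\varepsilon_{A'}{}^{1'}\right) \\
\delta_{a}{}^{2} &= \tfrac{1}{\sqrt{2}}\left(\varepsilon_{A}{}^{1}\varepsilon_{A'}{}^{0'} + \varepsilon_{A}{}^{0}\varepsilon_{A'}{}^{1'}\right) \\
\delta_{a}{}^{3} &= \tfrac{i}{\sqrt{2}}\left(\varepsilon_{A}{}^{1}\varepsilon_{A'}{}^{0'} - \varepsilon_{A}{}^{0}\varepsilon_{A'}{}^{1'}\right)
\end{aligned} \tag{4.40}$$

This is readily seen to give (4.13), where $u^{a} = u^{AA'}$ and

$$u^{\mathfrak{a}} = u^{a}\delta_{a}{}^{\mathfrak{a}}, \qquad u^{\mathfrak{AA}'} = u^{AA'}\varepsilon_{A}{}^{\mathfrak{A}}\varepsilon_{A'}{}^{\mathfrak{A}'} \tag{4.41}$$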


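as is required by the convention (3.20). The definitions (4.40) give us a
standard way of obtaining a world-vector basis from any spin frame. However,
it is important that we not be tied down to this particular correspondence
in all cases. Sometimes we shall prefer to obtain our world-vector basis
as the one naturally obtained from a coordinate system. In other circumstances
the null-tetrad

$$\delta_{0}{}^{a} = \varepsilon_{0}{}^{A}\varepsilon_{0'}{}^{A'}, \quad \delta_{1}{}^{a} = \varepsilon_{1}{}^{A}\varepsilon_{1'}{}^{A'}, \quad \delta_{2}{}^{a} = \varepsilon_{0}{}^{A}\varepsilon_{1'}{}^{A'}, \quad \delta_{3}{}^{a} = \varepsilon_{1}{}^{A}\varepsilon_{0'}{}^{A'} \tag{4.42}$$

is a convenient basis to use. In each case, we can define the Infeld-van der
Waerden translation symbols [46, 110]:

$$\sigma^{\mathfrak{a}}{}_{\mathfrak{AA}'} = \delta_{a}{}^{\mathfrak{a}}\,\varepsilon_{\mathfrak{A}}{}^{A}\varepsilon_{\mathfrak{A}'}{}^{A'}, \qquad \sigma_{\mathfrak{a}}{}^{\mathfrak{AA}'} = \delta_{\mathfrak{a}}{}^{a}\,\varepsilon_{A}{}^{\mathfrak{A}}\varepsilon_{A'}{}^{\mathfrak{A}'} \tag{4.43}$$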


(Recall that $a = AA'$, so a contraction is taking place between the abstract
labels. On the other hand, no connection is to be implied between $\mathfrak{a}$ and
$\mathfrak{A}$, $\mathfrak{A}'$, so no summation is involved. Each of (4.43) stands for 16 equations.)
In fact, the Infeld-van der Waerden symbols are really just $\delta_{a}^{b}$, but with the
upper and lower indices expressed according to different kinds of bases.
The relation between spinor components and world-tensor components is
then exemplified by

$$\psi_{\mathfrak{a}} = \psi_{\mathfrak{AA}'}\,\sigma_{\mathfrak{a}}{}^{\mathfrak{AA}'}, \qquad \psi_{\mathfrak{AA}'} = \psi_{\mathfrak{a}}\,\sigma^{\mathfrak{a}}{}_{\mathfrak{AA}'} \tag{4.44}$$
Now, consider twice the determinant of (4.13):

$$u^{\mathfrak{AA}'}u^{\mathfrak{BB}'}\varepsilon_{\mathfrak{AB}}\varepsilon_{\mathfrak{A}'\mathfrak{B}'} = u^{\mathfrak{a}}u^{\mathfrak{b}}g_{\mathfrak{ab}} \tag{4.45}$$

This becomes

$$u^{a}u^{b}\left(\varepsilon_{AB}\varepsilon_{A'B'} - g_{ab}\right) = 0 \tag{4.46}$$

on translation back to the abstract labels [cf. (3.21)], so [because of the
symmetry of $g_{ab}$ and of $\varepsilon_{AB}\varepsilon_{A'B'}$ under interchange of $a$, $b$ and the fact that
(4.46) holds for all $u^{a}$] we get

$$g_{ab} = \varepsilon_{AB}\varepsilon_{A'B'} \tag{4.47}$$

Equation (4.47) expresses the fundamental way in which the metric of space-time
is defined by its spinor structure. Note, in particular, that the
$(+, -, -, -)$ signature is built into the formalism by the identification (4.47), because
with the choice of world-vector basis (4.40), we get (4.15). As a consequence
of (4.47), (4.32), (4.31) we have

$$g^{ab} = \varepsilon^{AB}\varepsilon^{A'B'}, \qquad g_{a}{}^{b} = \varepsilon_{A}{}^{B}\varepsilon_{A'}{}^{B'}, \qquad g_{ab} = g_{ba}, \qquad g_{ab}g^{bc} = g_{a}{}^{c} \tag{4.48}$$

so the spinor raising and lowering conventions yield

$$u_{a} = u^{b}g_{ba}, \qquad u^{a} = g^{ab}u_{b} \tag{4.49}$$

Thus, the notation is consistent with the usual conventions.
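
The step from the determinant to the metric can be checked numerically. The following sketch assumes a particular form for the matrix (4.13), namely $(1/\sqrt{2})$ times the matrix with rows $(u^0 + u^1,\ u^2 + iu^3)$ and $(u^2 - iu^3,\ u^0 - u^1)$. That form is an assumption, since (4.13) is not reproduced in this section. The helper names are hypothetical.

```python
import math

# Components of eps_{AB} in a spin frame, as in (4.27).
EPS = [[0, 1], [-1, 0]]


def spinor_matrix(u):
    """Assumed form of (4.13): the Hermitian matrix of components u^{AA'}."""
    u0, u1, u2, u3 = u
    s = 1.0 / math.sqrt(2.0)
    return [[s * (u0 + u1), s * (u2 + 1j * u3)],
            [s * (u2 - 1j * u3), s * (u0 - u1)]]


def eps_eps_contraction(U):
    """eps_{AB} eps_{A'B'} U^{AA'} U^{BB'}, which is g_ab u^a u^b by (4.47)."""
    total = 0
    for a in range(2):
        for ap in range(2):
            for b in range(2):
                for bp in range(2):
                    total += EPS[a][b] * EPS[ap][bp] * U[a][ap] * U[b][bp]
    return total


def minkowski_norm(u):
    """u^a u_a with signature (+, -, -, -)."""
    return u[0] ** 2 - u[1] ** 2 - u[2] ** 2 - u[3] ** 2


u = (2.0, 0.5, -1.0, 0.25)
U = spinor_matrix(u)
twice_det = 2 * (U[0][0] * U[1][1] - U[0][1] * U[1][0])
```

Under the stated assumption, `twice_det`, `eps_eps_contraction(U)` and `minkowski_norm(u)` agree up to rounding, which is the content of (4.45)-(4.47).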
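So far, nothing has been said to relate the spinor structure to the
manifold $\mathcal{M}$ as a whole. In order to connect the various $S^{\bullet}(P)$'s to each
other, it will be necessary to consider differentiation. Let us define
axiomatically the various rules we shall need. We require an operation of
covariant differentiation:

$$\nabla_{XY'}: \; S^{\ldots}_{\ldots} \to S^{\ldots}_{\ldots XY'} \tag{4.50}$$

satisfying

$$\nabla_{XY'}\left(\psi^{\ldots}_{\ldots} + \phi^{\ldots}_{\ldots}\right) = \nabla_{XY'}\psi^{\ldots}_{\ldots} + \nabla_{XY'}\phi^{\ldots}_{\ldots} \tag{4.51}$$
$$\nabla_{XY'}\left(\psi^{\ldots}_{\ldots}\phi^{\ldots}_{\ldots}\right) = \phi^{\ldots}_{\ldots}\nabla_{XY'}\psi^{\ldots}_{\ldots} + \psi^{\ldots}_{\ldots}\nabla_{XY'}\phi^{\ldots}_{\ldots} \tag{4.52}$$
$$\nabla_{XY'}\varepsilon_{AB} = 0, \qquad \nabla_{XY'}\varepsilon^{AB} = 0 \tag{4.53}$$
$$\psi^{\ldots}_{\ldots XY'} = \nabla_{XY'}\chi^{\ldots}_{\ldots} \quad\text{implies}\quad \bar\psi^{\ldots}_{\ldots X'Y} = \nabla_{YX'}\bar\chi^{\ldots}_{\ldots} \tag{4.54}$$
$$\psi^{\ldots A \ldots}_{\ldots B \ldots XY'} = \nabla_{XY'}\chi^{\ldots A \ldots}_{\ldots B \ldots} \quad\text{implies}\quad \psi^{\ldots A \ldots}_{\ldots A \ldots XY'} = \nabla_{XY'}\chi^{\ldots A \ldots}_{\ldots A \ldots} \tag{4.55}$$
$$\nabla_{XY'} \text{ commutes with any index substitution (not involving } X \text{ or } Y') \tag{4.56}$$
$$\nabla_{a}\nabla_{b}\phi = \nabla_{b}\nabla_{a}\phi \tag{4.57}$$

for any derivation $\xi$ on $T$, there exists a unique element
$\xi^{a} \in T^{a}$ such that $\xi(\phi) = \xi^{a}\nabla_{a}\phi$, for each $\phi \in T$ (4.58)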


Actually, the axioms (4.51)-(4.58) are not quite independent of one
another, but they express all the properties that we require. Note that it is
(4.58) which connects $\{S\}$ with the tangent bundle of $\mathcal{M}$. The world-vectors
become tangent vectors to $\mathcal{M}$. The operator $\nabla_{XY'}$ is uniquely defined
by all these properties. (Clearly, for any other pair of labels $M$, $N'$, say,
we can also define $\nabla_{MN'}$ by index substitution.) Conversely, we may ask
whether the existence of a system $\{S\}$ satisfying all the requirements we have
imposed implies any restriction on $\mathcal{M}$ as a differentiable manifold, or as a
pseudo-Riemannian manifold with given $(+, -, -, -)$ metric (with which
the $g_{ab}$ of (4.47) is supposed to agree). The answer is that locally no such
restriction is implied (that is, an $\{S\}$ exists for some open set containing any
given $P \in \mathcal{M}$) whereas globally there are such restrictions. A similar remark
applies to the uniqueness of $\{S\}$. These global restrictions on $\mathcal{M}$ will enter
into the discussion in the next section.


5 THE INTERPRETATION OF A SPIN-VECTOR 


In the previous section, spinors were introduced completely formally. 
Only the elements of {T} were given a direct interpretation in space-time 
terms. One might imagine that any geometrical interpretation of the re- 
maining elements of {S} would necessarily be very indirect. However, this 
turns out not to be the case. Any spin-vector has, in fact, a very graphic 
space-time interpretation up to sign. Even the sign can be given a meaning 
in terms of physical constructions. 

Let us try to interpret the spin-vector

$$\omega^{A} \in S^{A}(P)$$

The most obvious world-tensor we can build from $\omega^{A}$ is the world-vector

$$u^{a} = \omega^{A}\bar\omega^{A'} \tag{5.1}$$

Now $u^{a}u_{a} = |\omega^{A}\omega_{A}|^{2} = 0$ (since $\omega^{A}\omega_{A} = -\omega_{A}\omega^{A} = 0$), so $u^{a}$ points along
the null cone at $P$. In fact, every real null world-vector has either the form
(5.1) or the form

$$v^{a} = -\omega^{A}\bar\omega^{A'} \tag{5.2}$$

and every complex world-vector, null in the sense $w^{a}w_{a} = 0$, has the form

$$w^{a} = \omega^{A}\pi^{A'} \tag{5.3}$$

These facts follow at once from the representation (4.13), since a $(2 \times 2)$
matrix of vanishing determinant has rank at most unity and is therefore the
outer product of two vectors. Note that the existence of the spinor system
$\{S\}$ gives us an absolute distinction between the two null half-cones at $P$. We
can define the future-pointing null vectors to be those for which (5.1) holds
(with $\omega^{A} \neq 0$) and the past-pointing null vectors to be those for which (5.2)
holds ($\omega^{A} \neq 0$). (This is consistent with (4.13) if positive $u^{0}$ corresponds to
time increasing.) This gives us the first global restriction on $\mathcal{M}$, as a
pseudo-Riemannian manifold, implied by the existence of $\{S\}$: $\mathcal{M}$ must be
time-orientable. That is to say, the division of the null half-cones of $\mathcal{M}$ into two
classes, "future" and "past," can be made continuously over the whole
manifold.
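
As a concrete illustration of (5.1), the sketch below builds the world-vector components from a spin-vector and checks that the result is null, future-pointing, and unchanged by a phase factor. It assumes that (4.13) has the form $(1/\sqrt{2})$ times the matrix with rows $(u^0 + u^1,\ u^2 + iu^3)$ and $(u^2 - iu^3,\ u^0 - u^1)$, which is not reproduced in this section. The helper name `flagpole` is hypothetical.

```python
import cmath
import math

SQRT2 = math.sqrt(2.0)


def flagpole(omega):
    """Components (u0, u1, u2, u3) of u^a = omega^A conj(omega)^A'.

    Assumed convention (not fixed by this excerpt): the Hermitian matrix
    U[A][A'] equals (1/sqrt 2) [[u0 + u1, u2 + i u3], [u2 - i u3, u0 - u1]].
    """
    U = [[omega[a] * omega[b].conjugate() for b in range(2)] for a in range(2)]
    u0 = (U[0][0] + U[1][1]).real / SQRT2
    u1 = (U[0][0] - U[1][1]).real / SQRT2
    u2 = (U[0][1] + U[1][0]).real / SQRT2
    u3 = ((U[0][1] - U[1][0]) / 1j).real / SQRT2
    return (u0, u1, u2, u3)


def minkowski_norm(u):
    """u^a u_a with signature (+, -, -, -)."""
    return u[0] ** 2 - u[1] ** 2 - u[2] ** 2 - u[3] ** 2


omega = [1.0 + 2.0j, -0.5 + 0.25j]
u = flagpole(omega)

# Multiplying omega by a pure phase leaves the null vector unchanged.
rephased = [cmath.exp(0.7j) * w for w in omega]
u_rephased = flagpole(rephased)
```

The time component `u[0]` is $(|\omega^0|^2 + |\omega^1|^2)/\sqrt{2}$, so it is positive for any nonzero spin-vector under this convention.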

Equation (5.1) tells us, then, that a future-pointing real null world-vector
is uniquely defined by any nonzero spin-vector. However, many
spin-vectors will correspond to the same null world-vector since (5.1) is
invariant under

$$\omega^{A} \to e^{i\theta}\omega^{A} \tag{5.4}$$

($\theta$ real). To interpret the "phase" of $\omega^{A}$ we have to try something different.
Let us "square" $\omega^{A}$ instead: $\omega^{A}\omega^{B}$. Then, to introduce as many primed
indices as unprimed, we multiply by $\varepsilon^{A'B'}$. Finally, to get a real world-tensor
we add the complex conjugate:

$$p^{ab} = \omega^{A}\omega^{B}\varepsilon^{A'B'} + \varepsilon^{AB}\bar\omega^{A'}\bar\omega^{B'} \tag{5.5}$$

Now $p^{ab}$ is readily seen to be skew and simple:

$$p^{ab} = u^{a}k^{b} - k^{a}u^{b} \tag{5.6}$$

where

$$k^{a} = \omega^{A}\bar\kappa^{A'} + \kappa^{A}\bar\omega^{A'} \tag{5.7}$$

$\kappa^{A}$ being any element of $S^{A}(P)$ subject to $\omega_{A}\kappa^{A} = 1$, whence (by (4.37))
$\omega^{A}\kappa^{B} - \kappa^{A}\omega^{B} = \varepsilon^{AB}$. The vector $k^{a}$ is real, spacelike, of length $\sqrt{2}$ and
orthogonal to $u^{a}$:

$$\bar k^{a} = k^{a}, \qquad k^{a}k_{a} = -2, \qquad k^{a}u_{a} = 0 \tag{5.8}$$

It is defined by $p^{ab}$ up to the addition of a (real) multiple of $u^{a}$. The directions
$k^{a}$ therefore define a half-plane element at $P$ (two-dimensional) which is
tangent to the null cone along the null direction of $u^{a}$. Thus $\omega^{A}$ defines
a kind of null flag [70, 73, 114] in the tangent space at $P$, whose flagpole is the
future-pointing null vector $u^{a}$ and whose flag plane is this null 2-plane
through $u^{a}$ (Fig. 7).
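
The properties (5.8) can be verified on a numerical example. This is a sketch under two assumptions: the contraction $\omega_A\kappa^A$ is taken to be $\omega^0\kappa^1 - \omega^1\kappa^0$ (from the lowering rule $\xi_B = \xi^A\varepsilon_{AB}$), and the matrix (4.13) is taken to be $(1/\sqrt{2})$ times the matrix with rows $(u^0 + u^1,\ u^2 + iu^3)$ and $(u^2 - iu^3,\ u^0 - u^1)$. The helpers `partner`, `outer`, `components` and `inner` are hypothetical names.

```python
import math

SQRT2 = math.sqrt(2.0)


def outer(a, b):
    """Matrix with entries a^A conj(b)^A'."""
    return [[a[i] * b[j].conjugate() for j in range(2)] for i in range(2)]


def components(M):
    """World-vector components of a Hermitian matrix M (assumed (4.13) convention)."""
    return ((M[0][0] + M[1][1]).real / SQRT2,
            (M[0][0] - M[1][1]).real / SQRT2,
            (M[0][1] + M[1][0]).real / SQRT2,
            ((M[0][1] - M[1][0]) / 1j).real / SQRT2)


def inner(u, v):
    """Minkowski inner product with signature (+, -, -, -)."""
    return u[0] * v[0] - u[1] * v[1] - u[2] * v[2] - u[3] * v[3]


def partner(omega):
    """One choice of kappa with omega^0 kappa^1 - omega^1 kappa^0 = 1."""
    n = abs(omega[0]) ** 2 + abs(omega[1]) ** 2
    return [-omega[1].conjugate() / n, omega[0].conjugate() / n]


omega = [1.0 + 2.0j, -0.5 + 0.25j]
kappa = partner(omega)

U = outer(omega, omega)                       # u^a, as in (5.1)
OK, KO = outer(omega, kappa), outer(kappa, omega)
K = [[OK[i][j] + KO[i][j] for j in range(2)] for i in range(2)]  # k^a, as in (5.7)

u, k = components(U), components(K)
```

Any other admissible $\kappa^A$ differs from this one by a multiple of $\omega^A$, which changes $k^a$ only by a real multiple of $u^a$, in line with the remark following (5.8).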

To get a better picture of the nature of this flag plane, let us take a
section in the tangent space at $P$ by a spacelike hyperplane not through $P$.
This cuts the cone in a sphere $S^{2}$. The flagpole intersects $S^{2}$ in a point $U$
and the flag plane gives us a unit tangent vector to $S^{2}$ at $U$ (which we may
identify with that choice of $2^{-1/2}k^{a}$ which lies parallel to our hyperplane).
We can get a direct physical interpretation of this $S^{2}$ as the "celestial sphere"
(or field of vision) of an observer at $P$. (We can choose our spacelike
hyperplane orthogonal to the observer's world line at $P$.) A photon arriving
at $P$ has a world line (a null geodesic) which has a null tangent at $P$ and is
therefore uniquely associated with a point $U$ of $S^{2}$. We may also describe
the polarization of the photon in terms of a flag plane, that is, as a tangent
vector at $U$ to the celestial sphere. Thus, a spin vector emerges as a natural
object for the description of a polarized photon. This relation between
spinors and zero rest-mass particle fields (of various spins) will be discussed
in more detail in Section 8.

FIGURE 7. The spin vector $\omega^{A}$ at $P$ defines a null flag. This may be pictured as
a tangent vector at a point $U$ on the celestial sphere.

One final remark concerning the structure of the celestial sphere is
appropriate here. The Riemannian metric induced on $S^{2}$ by the above
construction will depend on our choice of spacelike hyperplane, but the
conformal structure of $S^{2}$ is, on the other hand, well-defined. One way of
seeing this is to refer to (4.18), which when applied to $\zeta = \omega^{0}/\omega^{1}$ yields the
fractional linear transformation $\zeta \to (t^{0}{}_{0}\zeta + t^{0}{}_{1})/(t^{1}{}_{0}\zeta + t^{1}{}_{1})$. The complex
number $\zeta$ defines the point $U$, the complex $\zeta$-plane being actually related
to $S^{2}$ by a stereographic projection [71]. Both these transformations are
conformal (and, in fact, send circles into circles). Since (4.18) corresponds
to a Lorentz transformation, it follows that the celestial spheres of two
observers at $P$ are related to each other by a conformal transformation [71,
106] (and a circular pattern seen by one observer will also appear as circular
to the other). Another way of invariantly obtaining the conformal structure
of $S^{2}$ will emerge in a moment.
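
The passage from (4.18) to the fractional linear transformation of $\zeta$ can be illustrated directly. In the sketch below the matrix `t` is an arbitrary unimodular example chosen for illustration, and `act`, `mobius` and `cross_ratio` are hypothetical helper names. The cross-ratio of four points is real exactly when they lie on a common circle or line, so preserving it shows that circles go to circles.

```python
import cmath


def act(t, omega):
    """Apply (4.18): omega^A -> t^A_B omega^B."""
    return [t[0][0] * omega[0] + t[0][1] * omega[1],
            t[1][0] * omega[0] + t[1][1] * omega[1]]


def mobius(t, z):
    """Fractional linear transformation induced on zeta = omega^0 / omega^1."""
    return (t[0][0] * z + t[0][1]) / (t[1][0] * z + t[1][1])


def cross_ratio(z1, z2, z3, z4):
    """Cross-ratio of four distinct complex numbers."""
    return ((z1 - z3) * (z2 - z4)) / ((z1 - z4) * (z2 - z3))


# An illustrative unimodular matrix: (1 + i)(1) - (2)(i/2) = 1.
t = [[1 + 1j, 2], [0.5j, 1]]
det_t = t[0][0] * t[1][1] - t[0][1] * t[1][0]

omega = [0.3 + 0.4j, 1.2 - 0.5j]
zeta = omega[0] / omega[1]
new_omega = act(t, omega)
new_zeta = new_omega[0] / new_omega[1]

# Four points on the unit circle and their images.
circle = [cmath.exp(1j * a) for a in (0.3, 1.1, 2.5, 4.0)]
images = [mobius(t, z) for z in circle]
cr_before = cross_ratio(*circle)
cr_after = cross_ratio(*images)
```

Here `new_zeta` agrees with `mobius(t, zeta)`, and the cross-ratio stays real, so the four image points again lie on a circle (or line) in the $\zeta$-plane.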

If we multiply $\omega^{A}$ by a nonzero complex number $\lambda = re^{i\theta}$ ($r$, $\theta$ real), then
$u^{a}$ gets multiplied by $r^{2}$ and (taking $\kappa^{A} \to \lambda^{-1}\kappa^{A}$) $k^{a}$ gets rotated through an
angle $2\theta$ (in the plane spanned by $k^{a}$ and $h^{a} = i\omega^{A}\bar\kappa^{A'} - i\kappa^{A}\bar\omega^{A'}$). Thus, we
see that under $\omega^{A} \to \lambda\omega^{A}$, the extent of the flagpole is multiplied by a factor $\lambda\bar\lambda$
and the flag plane rotates through an angle $2\arg\lambda$. Note that this gives a
direct way of defining the conformal structure of our $S^{2}$ in terms of spinors.
If $U$ is a point of $S^{2}$, then the angle between the two tangent vectors to $S^{2}$
at $U$ given by $\omega^{A}$ and $e^{i\theta}\omega^{A}$, respectively, is simply $2\theta$. Having an invariant
definition of angles on $S^{2}$ we therefore have an invariantly defined conformal
structure for $S^{2}$. Note also that we have an invariantly defined orientation
for $S^{2}$ and therefore also for $\mathcal{M}$. We can define the notion of right-handedness
by simply specifying that "right-handed" is the sense in which the flag
plane of $e^{i\theta}\omega^{A}$ rotates as $\theta$ increases. (This is consistent with the reference
frame referred to in (4.13) being a right-handed one.) This gives us our second
global restriction on $\mathcal{M}$ as a pseudo-Riemannian manifold: $\mathcal{M}$ must, in
addition to being time-orientable, also be space-orientable. As a topological
manifold, therefore, $\mathcal{M}$ is orientable.

The third and final global restriction on $\mathcal{M}$ as a pseudo-Riemannian
(or topological) manifold, which is implied by the presence of $\{S\}$, arises in
relation to another feature of the above representation of a spin-vector as a
null flag. We note that both the $u^{a}$ of (5.1) and the $p^{ab}$ of (5.6) are invariant
under

$$\omega^{A} \to -\omega^{A} \tag{5.9}$$

Explicitly, if we consider the flag plane given by $e^{i\theta}\omega^{A}$ as $\theta$ varies from 0 to $\pi$,
we see that it executes one complete rotation through $2\pi$ and returns to its
starting position, whereas $\omega^{A}$ becomes replaced by its negative. Only after
a further complete rotation of the flag plane through $2\pi$ will the spin vector
return to its original value. It is clear, moreover, that no additional local
geometric structure added to the null flag can distinguish $\omega^{A}$ from $-\omega^{A}$
(interpreting "local geometric structure" to refer to structure in the tangent
space to $\mathcal{M}$ at $P$). For, a rotation of any such structure through $2\pi$ would
return to its original configuration, whereas $\omega^{A}$ would undergo (5.9).
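
The doubling of the angle can be seen numerically by working directly with the Hermitian matrices $k^{AA'} = \omega^A\bar\kappa^{A'} + \kappa^A\bar\omega^{A'}$ and $h^{AA'} = i\omega^A\bar\kappa^{A'} - i\kappa^A\bar\omega^{A'}$, so that no choice of world-vector basis is needed. This is a sketch only: the spinor values are arbitrary, and the helper names `flag_matrices`, `rotate` and `max_diff` are hypothetical.

```python
import cmath
import math


def flag_matrices(omega, kappa):
    """Return (K, H) with K = omega kappa^dagger + kappa omega^dagger and
    H = i omega kappa^dagger - i kappa omega^dagger."""
    K = [[omega[i] * kappa[j].conjugate() + kappa[i] * omega[j].conjugate()
          for j in range(2)] for i in range(2)]
    H = [[1j * omega[i] * kappa[j].conjugate() - 1j * kappa[i] * omega[j].conjugate()
          for j in range(2)] for i in range(2)]
    return K, H


def rotate(omega, kappa, theta):
    """omega -> e^{i theta} omega, kappa -> e^{-i theta} kappa (keeps omega_A kappa^A fixed)."""
    phase = cmath.exp(1j * theta)
    return [phase * w for w in omega], [k / phase for k in kappa]


def max_diff(M, N):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(M[i][j] - N[i][j]) for i in range(2) for j in range(2))


omega = [1.0 + 2.0j, -0.5 + 0.25j]
kappa = [0.3 - 0.1j, 0.8 + 0.6j]
K, H = flag_matrices(omega, kappa)

# A phase theta rotates k through 2*theta in the plane of k and h.
theta = 0.4
K_rot, _ = flag_matrices(*rotate(omega, kappa, theta))
expected = [[math.cos(2 * theta) * K[i][j] + math.sin(2 * theta) * H[i][j]
             for j in range(2)] for i in range(2)]

# At theta = pi the flag is back where it started, but omega has changed sign.
omega_pi, kappa_pi = rotate(omega, kappa, math.pi)
K_pi, _ = flag_matrices(omega_pi, kappa_pi)
```

The final two lines reproduce the situation described above: the flag matrix `K_pi` coincides with `K`, while `omega_pi` is the negative of `omega`.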

On the other hand, a nonlocal geometrical distinction between $\omega^{A}$ and
$-\omega^{A}$ is possible. To obtain this, we consider a particular null flag at a
particular point $O$ of $\mathcal{M}$. Call this the standard flag. For simplicity, let us
restrict our attention to a simply-connected region of $\mathcal{M}$ (or else work with
the universal covering manifold of $\mathcal{M}$). The idea, then, is to identify a
spin vector, not just with a null flag, but as an equivalence class of flag paths.$^{18}$
A flag path is a continuous sequence of null flags on $\mathcal{M}$ which start with the
standard flag and end with some null flag at a point $P$ of $\mathcal{M}$ (Fig. 8). Two
flag paths are equivalent if they can be continuously distorted one into the
other, keeping the end-points fixed, and the null flags at the end-points fixed.
The existence of spin vectors now depends on the existence of two equivalence
classes corresponding to each null flag at $P$. If we consider flag paths based
on a fixed curve on $\mathcal{M}$ and ending at a fixed null flag, then it follows from the
topology of the Lorentz group that there are precisely two equivalence classes
of flag paths associated with the curve, which are related to each other by a
net relative rotation through $2\pi$ between $O$ and $P$. But if we allow the curve,
on which the flag path is based, to vary over the manifold and then return to
its original position, it may be that these equivalence classes become united
inside a single equivalence class. Then spin-vectors could not be defined on
the manifold. Another way of stating this property arises if we allow $P$ to
coincide with $O$. If the (closed) flag path consisting simply of a rotation
through $2\pi$ of the standard flag about its flagpole can be continuously
distorted, by moving the curve about the manifold, into the constant flag path
(that is, no motion of the null flag at all), then a spinor structure does not
exist.

18 I have been deliberately somewhat vague as to whether the geometrical structure
corresponding to the zero spin vector at $P$ should be called a "null flag." However, in the
case of a flag path, the "zero null flags" are to be definitely excluded.

FIGURE 8. Flag paths in $\mathcal{M}$ and their images in $\mathcal{F}$ and in the spin-vector bundle.
[Labels in the figure: spin-vector bundle; $S^{\bullet}(O)$; null-flag bundle $\mathcal{F}$; space-time $\mathcal{M}$;
standard flag.]

The above construction clearly actually refers to the null-flag bundle
$\mathcal{F}$ of $\mathcal{M}$ (an 8-dimensional space of all null flags at all points of $\mathcal{M}$). A flag
path corresponds to a curve in $\mathcal{F}$. If $\mathcal{M}$ is simply-connected, the above
construction gives the spin-vector bundle (space of all spin vectors at all
points of $\mathcal{M}$) as the universal (twofold) covering space of $\mathcal{F}$, provided that $\mathcal{F}$
possesses a twofold covering. If $\mathcal{F}$ does not possess a twofold covering
(that is, is simply-connected) then $\{S\}$ does not exist for $\mathcal{M}$. When $\{S\}$
does exist for $\mathcal{M}$, then we can interpret the elements of $S^{\bullet}$ as the $C^{\infty}$ cross
sections of the spin-vector bundle. When $\mathcal{M}$ itself is not simply-connected the
situation becomes more complicated because flag paths belonging to nonhomotopic
curves on $\mathcal{M}$ cannot be compared. The spinor structure $\{S\}$ for $\mathcal{M}$, if it exists,
becomes nonunique. But in any case, for the existence of a spinor structure
we can refer to the universal covering manifold of $\mathcal{M}$ and the situation is the
same as before.

The foregoing discussion has been given in terms of the bundle $\mathcal{F}$ of
null-flags since this seems to be the most direct geometrical route to the
space-time description of a spin-vector. It is more usual to use the bundle of
all orthonormal frames on $\mathcal{M}$, however, and this is more appropriate for the
general $n$-dimensional discussion. The two procedures are readily seen to be
equivalent for a space-time manifold. Also, since $\mathcal{M}$ is orientable (that is,
vanishing of Stiefel-Whitney class $w_{1}$), it is not difficult to adapt the above
arguments to show that the remaining topological condition on $\mathcal{M}$ implied
by the existence of $\{S\}$ is that its Stiefel-Whitney class $w_{2}$ should vanish
[38, 57]. This means that for every 2-surface in $\mathcal{M}$, we can put a set of three
linearly independent vectors at each point of the surface in a continuous way.
(See [113a] and Chs. V and XVII by Lichnerowicz and Bott for discussions
of this.) It is of interest that "plausible" space-time manifolds $\mathcal{M}$ do
exist$^{19}$ which are space- and time-orientable and which contain no closed (or
almost closed) timelike curves, but for which $w_{2} \neq 0$. For such $\mathcal{M}$, the
system $\{S\}$ could not exist.

We may ask what are the physical grounds for believing in the existence 
of a spinor structure for space-time. How much physical reality can one 
indeed assign to an object which is not returned to its original state when 
a rotation through 360° is applied to it? The idea seems totally foreign to 


19 Take a complex projective plane $\mathcal{P}$ (a real 4-manifold) and choose a (positive
definite) $C^{\infty}$ Riemannian metric on $\mathcal{P}$. Let $\phi$ be a real $C^{\infty}$ function on $\mathcal{P}$ with only
isolated critical points. Remove these points from the manifold and use the unit vector
corresponding to the gradient of $\phi$ to construct a $C^{\infty}$ Lorentzian $(+, -, -, -)$
pseudo-Riemannian metric. The resulting "space-time" is time-orientable, space-orientable,
without closed (that is, $\cong S^{1}$) timelike curves [in fact, satisfying strong causality (11.1)],
but possesses no spin structure ($w_{2} \neq 0$). I am grateful to R. Bott and S. Smale for this
example.

Added in proof: In a recent article (submitted to J. Math. Phys.), Geroch has shown 
that for a non-compact space-time, existence of a spinor structure is equivalent to global 
existence of a continuous field of orthonormal tetrads. He also shows that our assumption 
(4.5) is, in fact, redundant. 


our experiences. Nevertheless, the wave functions of electrons, protons,
neutrons, neutrinos, and many other particles do in fact behave in this way.
But we might well argue that this is not conclusive, because a wave function
is a somewhat nebulous concept. In any case, surely it is only the overall
sign of the wave function which is altered upon rotation through $2\pi$ and is not
the sign of a wave function supposed to be unobservable? In an interesting
recent article, Aharonov and Susskind [1] argue that this view cannot be
maintained. They show how it is possible, in principle, to construct an
apparatus which, depending precisely on this spinor nature of the electron
wave equation, exhibits such behavior under rotation on a macroscopic scale.
The apparatus consists of two parts which, when fitted together, result in a
detectable electric current flowing from one part to the other. The parts are
then separated and one is rotated through $2\pi$ relative to the other. They
are again fitted together in precisely the same relative orientation as before.
A detectable current again flows from one part to the other, but now the
direction of flow is reversed! Only by separating the parts and applying a
further rotation, through $2\pi$, of one part relative to the other, is the original
direction of flow restored. Thus we have a situation in which the geometry
of the apparatus, in the usual sense, does not define its behavior. In this
case, by widening the concept of geometry to include spinorial quantities,
we can again think of the geometrical configuration as determining the
behavior.

The Aharonov-Susskind apparatus gives us a “physical”? way of 
picturing a spin-vector. We may think of one part of the apparatus as being 
attached to our standard flag and allow the other part to be attached to 
another null flag which moves about the space-time. Then the apparatus 
keeps track of the parity of the number of complete relative rotations which 
take place, so in effect we have a spin-vector. If it could be convincingly
argued that such pieces of apparatus could (in principle) retain their "memory"
over large (say cosmological) distances and times, then we should have a
fairly conclusive argument that $w_{2}$ for the universe really does vanish!


6 EXPLICIT CURVATURE FORMULAS 


The notation of Section 3 yields, in an automatic way, all the familiar 
basic formulas involving the curvature, Christoffel symbols, and Ricci 
rotation coefficients, once certain basic definitions are given. This will be 
indicated here, in conjunction with a parallel description of the various spinor 
formulas, based on the notation of Section 4. 

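Consider a basis $\delta_{\mathfrak{a}}{}^{a}$ for $T^{a}$ and a (possibly unrelated) basis $\varepsilon_{\mathfrak{A}}{}^{A}$ for $S^{A}$.
Let $\delta_{a}{}^{\mathfrak{a}}$, $\varepsilon_{A}{}^{\mathfrak{A}}$ be the corresponding dual bases. Define

$$\Gamma^{\mathfrak{a}}{}_{\mathfrak{bc}} = \delta_{a}{}^{\mathfrak{a}}\nabla_{\mathfrak{c}}\delta_{\mathfrak{b}}{}^{a} = -\delta_{\mathfrak{b}}{}^{a}\nabla_{\mathfrak{c}}\delta_{a}{}^{\mathfrak{a}} \tag{6.1}$$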


and 
= bin Venta Sta V. (6.2) 
YaBed: AB VY eD fy B Vor ou 


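[Here $\nabla_{\mathfrak{c}} = \delta_{\mathfrak{c}}{}^{c}\nabla_{c}$ and $\nabla_{\mathfrak{CD}'} = \varepsilon_{\mathfrak{C}}{}^{C}\varepsilon_{\mathfrak{D}'}{}^{D'}\nabla_{CD'}$, in accordance with (3.20). The
equivalence of the two formulas in (6.1) and in (6.2) is an immediate consequence
of the rules (4.51)-(4.56), (3.17), (4.30), (4.31).] The quantities
$\Gamma^{\mathfrak{a}}{}_{\mathfrak{bc}}$ specialize either to Ricci rotation coefficients [in the case when the basis
frame is chosen so that $g_{\mathfrak{ab}}$ has a specific form, for example (4.15)] or to
Christoffel symbols (when the basis is that naturally derived from a coordinate
system). The quantities $\gamma_{\mathfrak{ABCD}'}$ are called spin coefficients [64]. As an
example, let us derive the usual Christoffel symbol symmetry. Let $x^{0}, \ldots, x^{3}$
be a local coordinate system on $\mathcal{M}$. Then the associated coordinate basis is
given by

$$\delta_{a}{}^{\mathfrak{a}} = \nabla_{a}x^{\mathfrak{a}} \tag{6.3}$$

with $\delta_{\mathfrak{a}}{}^{a}$ dual to this. Thus we have, in this case,

$$\Gamma^{\mathfrak{a}}{}_{\mathfrak{bc}} = -\delta_{\mathfrak{b}}{}^{b}\nabla_{\mathfrak{c}}\nabla_{b}x^{\mathfrak{a}} = -\delta_{\mathfrak{b}}{}^{b}\delta_{\mathfrak{c}}{}^{c}\nabla_{c}\nabla_{b}x^{\mathfrak{a}} \tag{6.4}$$

which is manifestly symmetric in $\mathfrak{c}$, $\mathfrak{b}$ because of the vanishing torsion
condition (4.57). Also the spin coefficient symmetry

$$\gamma_{\mathfrak{ABCD}'} = \gamma_{\mathfrak{BACD}'} \tag{6.5}$$

is immediate from (6.2).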
The formulas (6.1) and (6.2), together with (3.20), (3.21) yield the usual
formulas for covariant differentiation in terms of components. This is
exemplified by


5¢ 5p Oa(Veb%4) = 5p 5g Ve(S". Oy 55) 
VG ewer (6.6) 
Bg°Ep” EyEg (Ven 4") = by “eg Ven (Og" En € 5 ) 
= Ven Oy” — 06° Ten: + Oy9 TP ga (6.7) 


(Recall that when acting on quantities without abstract labels, $\nabla_{\mathfrak{c}}$ is just
the ordinary gradient operator.)
Let us now define the curvature tensor and curvature spinors. Put 


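$$R^{a}{}_{bcd} = 2\,\delta_{\mathfrak{a}}{}^{a}\,\nabla_{[c}\nabla_{d]}\,\delta_{b}{}^{\mathfrak{a}} \tag{6.8}$$

and, defining

$$\Box_{AB} = \nabla_{X'(A}\nabla_{B)}{}^{X'}, \qquad \Box_{A'B'} = \nabla_{X(A'}\nabla_{B')}{}^{X} \tag{6.9}$$

set

$$\Psi_{ABCD} = \varepsilon_{D\mathfrak{D}}\,\Box_{(AB}\,\varepsilon_{C)}{}^{\mathfrak{D}} \tag{6.10}$$
$$\Phi_{ABC'D'} = \varepsilon_{D'\mathfrak{D}'}\,\Box_{AB}\,\varepsilon_{C'}{}^{\mathfrak{D}'} \tag{6.11}$$
$$\Lambda = \tfrac{1}{6}\,\varepsilon_{A\mathfrak{D}}\,\Box^{AB}\,\varepsilon_{B}{}^{\mathfrak{D}} \tag{6.12}$$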


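(Here the conventions are being employed that: square brackets around
indices denote antisymmetrization and round brackets around indices denote
symmetrization, for example, $\alpha_{[ab]} = \tfrac{1}{2}(\alpha_{ab} - \alpha_{ba})$,
$\beta_{[abc]} = \tfrac{1}{6}(\beta_{abc} + \beta_{bca} + \beta_{cab} - \beta_{acb} - \beta_{cba} - \beta_{bac})$,
$\gamma_{(AB)} = \tfrac{1}{2}(\gamma_{AB} + \gamma_{BA})$, etc.) Since (6.8), (6.10)-(6.12)
involve German indices on the right-hand sides we have to show that
the expressions defined are, in fact, basis-independent. This follows because
the operators $\nabla_{[a}\nabla_{b]}$, $\Box_{AB}$, $\Box_{A'B'}$ all satisfy

$$\begin{aligned}
\mathcal{D}\left(\psi^{\ldots}_{\ldots} + \phi^{\ldots}_{\ldots}\right) &= \mathcal{D}\psi^{\ldots}_{\ldots} + \mathcal{D}\phi^{\ldots}_{\ldots} \\
\mathcal{D}\left(\psi^{\ldots}_{\ldots}\phi^{\ldots}_{\ldots}\right) &= \phi^{\ldots}_{\ldots}\mathcal{D}\psi^{\ldots}_{\ldots} + \psi^{\ldots}_{\ldots}\mathcal{D}\phi^{\ldots}_{\ldots} \\
\mathcal{D}\phi &= 0
\end{aligned} \tag{6.13}$$

[cf. (6.16)], where $\mathcal{D}$ symbolically stands for any of these operators. To
make a basis change in (6.8), (6.10)-(6.12) we make substitutions:

$$\delta_{\mathfrak{a}}{}^{a} \to T_{\mathfrak{a}}{}^{\mathfrak{b}}\,\delta_{\mathfrak{b}}{}^{a}, \qquad \delta_{a}{}^{\mathfrak{a}} \to (T^{-1})_{\mathfrak{b}}{}^{\mathfrak{a}}\,\delta_{a}{}^{\mathfrak{b}} \tag{6.14}$$
$$\varepsilon_{\mathfrak{A}}{}^{A} \to t_{\mathfrak{A}}{}^{\mathfrak{B}}\,\varepsilon_{\mathfrak{B}}{}^{A}, \qquad \varepsilon_{A}{}^{\mathfrak{A}} \to (t^{-1})_{\mathfrak{B}}{}^{\mathfrak{A}}\,\varepsilon_{A}{}^{\mathfrak{B}} \tag{6.15}$$

where $(-1)$ denotes matrix inverse. By virtue of (6.13), the transforming
matrix carries through the derivatives in each case and cancels with its inverse
to leave the expression unchanged. (Note that abstract labels do not
"transform"!)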
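
The bracket conventions stated above lend themselves to a direct computation. The sketch below stores a tensor's components as a dictionary keyed by index tuples and averages over permutations, with or without the permutation sign. The function names `sign` and `bracket`, and the sample components `alpha` and `beta`, are illustrative choices rather than anything defined in the text.

```python
from fractions import Fraction
from itertools import permutations, product


def sign(perm):
    """Sign of a permutation given as a tuple, via its inversion count."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def bracket(tensor, rank, dim, skew):
    """Square-bracket (skew=True) or round-bracket (skew=False) part of a tensor.

    `tensor` maps index tuples of length `rank`, each index in range(dim),
    to component values.
    """
    perms = list(permutations(range(rank)))
    result = {}
    for idx in product(range(dim), repeat=rank):
        total = Fraction(0)
        for p in perms:
            weight = sign(p) if skew else 1
            total += weight * tensor[tuple(idx[i] for i in p)]
        result[idx] = total / len(perms)
    return result


# Arbitrary sample components with two-valued indices.
alpha = {idx: 3 * idx[0] + 7 * idx[1] * idx[1] + 1
         for idx in product(range(2), repeat=2)}
beta = {idx: idx[0] + 2 * idx[1] + 5 * idx[2] * idx[0]
        for idx in product(range(2), repeat=3)}

alpha_skew = bracket(alpha, 2, 2, skew=True)
alpha_sym = bracket(alpha, 2, 2, skew=False)
beta_skew = bracket(beta, 3, 2, skew=True)
```

Because the indices here take only two values, `beta_skew` vanishes identically, which is the two-dimensionality argument used earlier to obtain (4.35).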

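The relation between these operators is given by

$$2\nabla_{[a}\nabla_{b]} = \varepsilon_{A'B'}\Box_{AB} + \varepsilon_{AB}\Box_{A'B'} \tag{6.16}$$

so, for example, $\Box_{AB} = \varepsilon^{A'B'}\nabla_{[a}\nabla_{b]}$. We can combine (6.10) and (6.12)
in the form

$$\varepsilon_{D\mathfrak{D}}\,\Box_{AB}\,\varepsilon_{C}{}^{\mathfrak{D}} = \Psi_{ABCD} - 2\Lambda\,\varepsilon_{D(A}\varepsilon_{B)C} \tag{6.17}$$

Also, (6.17) combines with (6.11) to give (via (6.16))

$$2\,\varepsilon_{D\mathfrak{D}}\,\nabla_{[a}\nabla_{b]}\,\varepsilon_{C}{}^{\mathfrak{D}} = \Psi_{ABCD}\,\varepsilon_{A'B'} - 2\Lambda\,\varepsilon_{D(A}\varepsilon_{B)C}\,\varepsilon_{A'B'} + \Phi_{CDA'B'}\,\varepsilon_{AB} \tag{6.18}$$

In fact, $\Psi_{ABCD}$, $\Lambda$, and $\Phi_{ABC'D'}$ are the parts [72, 115] of $R_{abcd}$ irreducible under
local SL(2, C) (Lorentz) transformations:

$$\begin{aligned}
R_{abcd} = {} & \Psi_{ABCD}\,\varepsilon_{A'B'}\varepsilon_{C'D'} + \varepsilon_{AB}\varepsilon_{CD}\,\bar\Psi_{A'B'C'D'} \\
& + 2\Lambda\left\{\varepsilon_{AC}\varepsilon_{BD}\,\varepsilon_{A'B'}\varepsilon_{C'D'} + \varepsilon_{AB}\varepsilon_{CD}\,\varepsilon_{A'D'}\varepsilon_{B'C'}\right\} \\
& + \varepsilon_{AB}\,\Phi_{CDA'B'}\,\varepsilon_{C'D'} + \varepsilon_{CD}\,\Phi_{ABC'D'}\,\varepsilon_{A'B'}
\end{aligned} \tag{6.19}$$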

Equation (6.19) follows by two applications of (6.18) [using (4.35)] to the
definition (6.8). The symmetries

$$R_{abcd} = R_{[ab][cd]} = R_{cdab}, \qquad R_{a[bcd]} = 0 \tag{6.20}$$

find expression in

$$\Psi_{ABCD} = \Psi_{(ABCD)}, \qquad \Lambda = \bar\Lambda, \qquad \Phi_{ABC'D'} = \Phi_{(AB)(C'D')} = \bar\Phi_{ABC'D'} \tag{6.21}$$

so that $\Psi_{ABCD}$, $\Lambda$, and $\Phi_{ABC'D'}$ belong, respectively, to a $D(2, 0)$, a $D(0, 0)$ and
a $D(1, 1)$ irreducible representation space for the local Lorentz group.

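In tensor terms, the reduction of $R_{abcd}$ into irreducible parts is somewhat
more complicated. We can define the Weyl (conformal) tensor by

$$C_{abcd} = \Psi_{ABCD}\,\varepsilon_{A'B'}\varepsilon_{C'D'} + \varepsilon_{AB}\varepsilon_{CD}\,\bar\Psi_{A'B'C'D'} \tag{6.22}$$

Purely tensorially, this is given by

$$C_{ab}{}^{cd} = R_{ab}{}^{cd} - 2R_{[a}{}^{[c}\delta_{b]}^{d]} + \tfrac{1}{3}R\,\delta_{[a}^{c}\delta_{b]}^{d} \tag{6.23}$$
$$R_{ab} = R_{acb}{}^{c}, \qquad R = R^{a}{}_{a} \tag{6.24}$$

Then, in addition to possessing all the Riemann tensor symmetries (6.20),
$C_{abcd}$ satisfies

$$C_{acb}{}^{c} = 0 \tag{6.25}$$

The $C_{abcd}$ is not yet irreducible (if we allow ourselves to consider complex
tensors) and splits down further into its so-called self-dual and anti-self-dual
parts: $C_{abcd} = {}^{+}C_{abcd} + {}^{-}C_{abcd}$, where ${}^{-}C_{abcd} = \Psi_{ABCD}\,\varepsilon_{A'B'}\varepsilon_{C'D'}$. ("Self-dual"
means, here, $\tfrac{1}{2}\,{}^{+}C_{abef}\,e^{ef}{}_{cd} = i\,{}^{+}C_{abcd}$, where $e_{abcd}$ is the skew-symmetric
Levi-Civita symbol in a right-handed orthonormal frame. In spinor terms
$e_{abcd} = i\varepsilon_{AC}\varepsilon_{BD}\varepsilon_{A'D'}\varepsilon_{B'C'} - i\varepsilon_{AD}\varepsilon_{BC}\varepsilon_{A'C'}\varepsilon_{B'D'}$.) The remaining irreducible
parts of $R_{abcd}$ are

$$R_{ab} - \tfrac{1}{4}R\,g_{ab} = -2\Phi_{ABA'B'} \tag{6.26}$$

and

$$R = 24\Lambda \tag{6.27}$$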

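As a verification that the definition (6.8), of the Riemann tensor,
actually agrees with the usual one, we can derive the Ricci identities. Consider,
for example, $\xi^{c}{}_{d}$. Then,

$$2\nabla_{[a}\nabla_{b]}\xi^{c}{}_{d} = 2\nabla_{[a}\nabla_{b]}\left(\xi^{\mathfrak{c}}{}_{\mathfrak{d}}\,\delta_{\mathfrak{c}}{}^{c}\delta_{d}{}^{\mathfrak{d}}\right) = \xi^{c}{}_{e}R^{e}{}_{dab} - \xi^{e}{}_{d}R^{c}{}_{eab} \tag{6.28}$$

by (6.13) and the fact [which also follows from (6.13) applied to (6.8)] that
$R^{a}{}_{bcd} = -2\,\delta_{b}{}^{\mathfrak{a}}\,\nabla_{[c}\nabla_{d]}\,\delta_{\mathfrak{a}}{}^{a}$. In an exactly similar way we obtain the
corresponding spinor Ricci identities, taking as an example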
Clare 9c” = Dare (9e6° en” Ec*) 


— py Xp’ D’ D'@X 
= 06 (PB yrara — 2NE x arEpy -) — Ox O' carp. 


(6.29) 


We can also directly obtain the expression for the components $R_{\mathfrak{abcd}}$ in terms
of $\Gamma^{\mathfrak{a}}{}_{\mathfrak{bc}}$ by applying (6.1) to (6.8) [with (3.20)]. This includes both the
familiar expression in terms of Christoffel symbols and the Ricci rotation
coefficient curvature formula. Somewhat less familiar are the corresponding
formulas which relate the spin coefficients to the components of the spinor
curvature quantities. These will be given explicitly here, since we shall be
interested in two particular cases later. Let us use (6.18) since this gives all
the curvature quantities at once. The result is [64]:


— pP2 
Ve’ Yucns’ — Voy: Yacse’ = {yupps’ Yacse’ + Yucps’ Yarse’ ~ Yapye’ acre’ 
RS’ 
— Jucpe’ Yaxrs’ + & {yucrn’ Ye seg = Yucen’ Yeres'n} 


+ Paorg fee: + Abe (Een Eng + Exp Ex) + Dycw-e Exn (6.30) 


9. 


In this connection, we note also the commutator of two “‘ intrinsic derivatives ”’: 
Q 
{Vass Veo: — Veo Vas}? = {e* (Yguca’ Vow’ — Ypous Van’) 


+ "© (Yarn Vas — Yoro-ea Vee) (6.31) 


7 EINSTEIN’S EQUATIONS AND FOCUSING 


The decomposition (6.19) of the Riemann tensor into its irreducible
spinorial parts allows us to discuss the structure of space-time curvature, as is
implied by Einstein's field equations, in some detail. Einstein's equations
are

    $R_{ab} - \tfrac{1}{2} R g_{ab} + \lambda g_{ab} = -2K T_{ab}$    (7.1)

where $\lambda$ and $K$ are real constants with $K > 0$ and where $T_{ab}$ is the local
stress-energy-momentum tensor (the "energy tensor") of matter. Here, the term
"matter" refers to any field except gravitation itself. Thus, for example, a
free electromagnetic field counts as "matter" in this context. The question
of the energy which must be assigned to gravitation itself is a more subtle one.
This will be discussed a little more later. Gravitational energy cannot be
adequately defined in a local way and emerges, instead, as some kind of
nonlocal quantity. The local gravitational energy must apparently be thought of
as zero, and this is consistent with (7.1). The cosmological term $\lambda$ is
generally put equal to zero, since there are no compelling theoretical or
observational grounds for believing in its existence. In any case, $\lambda$ would have
to be extremely small, of the order of the reciprocal of the square of the radius
of the universe.

As it stands, (7.1) implies no restriction on the curvature, since no
restriction has yet been placed on $T_{ab}$. But we can study (7.1) in three
different types of situation. In the first instance, we may be interested in
pure gravitation theory, for which $T_{ab} = 0$. Then (7.1) becomes a well-defined
set of nonlinear partial differential equations on the $g_{ab}$, when expressed in
terms of a coordinate system $x^a$. These equations are difficult to solve
explicitly and only a very few solutions are known which have any direct
relevance to physical situations. (Some of these will be given in later
sections.) On the other hand, there are certain general statements which
can be made about the nature (in particular the asymptotic nature, if suitable
boundary conditions are imposed) of the solutions. Now, secondly, we may
be interested in solutions of (7.1) for which the $T_{ab}$, though nonzero, is also
subject to certain conditions ("equations of state") which are normally
again partial differential equations, but which now refer to the field quantities
which contribute to $T_{ab}$. (For example, in the case of the "Einstein-Maxwell"
equations, $T_{ab}$ is specified as being the energy tensor of a free
electromagnetic field which is subject to Maxwell's source-free equations in
covariant form.) Again the equations become virtually impossible to solve
explicitly, in general, for realistic equations of state. Thirdly, we may be
content to make general statements about solutions of (7.1) for which we
merely impose inequalities on the $T_{ab}$. Such an inequality might state, for
example, the positive definite nature of the local energy density, or some other
such physically reasonable requirement. In many ways, this type of situation
is the easiest to handle and the last two sections will be devoted mainly to
the treatment of space-times under such conditions. Finally, it may be
remarked that one way of attempting to treat (7.1), which strictly speaking is
not valid, is to regard $T_{ab}$ as some given source distribution for which we try
to construct the gravitational field. This is a normal procedure in other
branches of physics, but here, until we have our metric $g_{ab}$, it is not even clear
what is meant by a given source distribution.

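If we now decompose $R_{abcd}$ into its spinorial parts and use (6.26) and
(6.27), we get

    $K T_{ab} = \Phi_{ABA'B'} + (3\Lambda - \tfrac{1}{2}\lambda)\,\varepsilon_{AB}\varepsilon_{A'B'}$    (7.2)

so $\Phi_{ab}$ represents the trace-free part of $T_{ab}$ and $\Lambda$ the trace:

    $\Phi_{ab} = K(T_{ab} - \tfrac{1}{4} T_c{}^c g_{ab}), \qquad \Lambda = \tfrac{1}{12}(K T_a{}^a + 2\lambda)$    (7.3)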
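
The step from (7.1) to (7.2) and (7.3) can be spelled out as follows. This is only a sketch: it assumes (7.1) in the form $R_{ab} - \tfrac{1}{2} R g_{ab} + \lambda g_{ab} = -2K T_{ab}$, uses (6.26) and (6.27) as quoted, and writes $\Phi_{ab}$ for $\Phi_{ABA'B'}$ and $g_{ab} = \varepsilon_{AB}\varepsilon_{A'B'}$.

```latex
\begin{align*}
R_{ab} &= -2\Phi_{ab} + 6\Lambda g_{ab}
  && \text{by (6.26) and (6.27)},\\
R_{ab} - \tfrac{1}{2} R g_{ab} + \lambda g_{ab}
  &= -2\Phi_{ab} - (6\Lambda - \lambda)\, g_{ab}
  && \text{since } \tfrac{1}{2} R = 12\Lambda,\\
K T_{ab} &= \Phi_{ab} + (3\Lambda - \tfrac{1}{2}\lambda)\, g_{ab}
  && \text{by (7.1)},\\
K T_a{}^a &= 12\Lambda - 2\lambda
  && \text{since } \Phi_a{}^a = 0,\ g_a{}^a = 4 .
\end{align*}
```

Solving the last line for $\Lambda$ gives the second relation in (7.3), and substituting it back into the third line gives the first.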


The part of $R_{abcd}$ which, locally, is left completely undetermined by $T_{ab}$ is the
Weyl tensor $C_{abcd}$, or, equivalently, the spinor $\Psi_{ABCD}$. We may thus think
of $\Psi_{ABCD}$ as representing the free gravitational part of the curvature. In
regions free of matter we can still have curvature. According to Einstein's
equations, then, such curvature will be of a particular type, defined
completely, in structure, by a totally symmetric spinor $\Psi_{ABCD}$. If $\lambda = 0$, then we
shall have, in fact, $R_{abcd} = C_{abcd}$ [with $C_{abcd}$ given by (6.22)]. Let us,
therefore, examine the nature of $\Psi_{ABCD}$. Spinor methods are particularly well
suited to this problem, the corresponding discussion of the Weyl tensor
$C_{abcd}$ being far less transparent.

Consider the form $\Psi_{ABCD}\,\xi^A\xi^B\xi^C\xi^D$, where for simplicity we choose
$\xi^0 = 1$, $\xi^1 = z$. This is a quartic polynomial in $z$ with complex coefficients
and therefore factorizes into four linear factors:

    $\Psi_{ABCD}\,\xi^A\xi^B\xi^C\xi^D = (\alpha_A\xi^A)(\beta_B\xi^B)(\gamma_C\xi^C)(\delta_D\xi^D)$    (7.4)

Equating coefficients [and eliminating the basis frame, cf. (3.21)] we get

    $\Psi_{ABCD} = \alpha_{(A}\beta_B\gamma_C\delta_{D)}$    (7.5)

The factorization is unique up to a complex multiplier for each of $\alpha_A, \ldots, \delta_D$.
Thus, the null direction defined by each of $\alpha_A, \ldots, \delta_D$ is uniquely
defined, so we see that $\Psi_{ABCD}$ defines an unordered set of four (possibly
coincident) directions along the null cone at each point $P$ of $\mathcal{M}$ (at which
$\Psi_{ABCD} \neq 0$) called the gravitational principal null directions [23, 72, 91] at $P$.
The coincidence schemes for the principal null directions at $P$ can be
represented as follows:


    Petrov type:      I          II         III
                    {1111}     {211}       {31}
                    {22}       {4}
                    {−}
    Arrows: {1111} → {211} → {31} → {4} → {−};  {211} → {22} → {4}.    (7.6)


where the arrows point in the direction of increasing specialization. Here,
{1111} represents the general case when the null directions are distinct;
{211} is the case when there is one double null direction and two distinct
simple ones; etc. The symbol {−} denotes the case $\Psi_{ABCD} = 0$ at $P$. The
"Petrov type," referring to the columns in (7.6), is a more direct
classification [80b, 81] of $C_{abcd}$ according to the dimension of the space spanned by
eigenvectors of the matrix $(C^{ab}{}_{cd})$ (grouping $a,b$ together and $c,d$ together).
The case {4}, when all the principal null directions coincide, is called the null
case; {22} is called degenerate; all cases but {1111} are called algebraically
special. In terms of $C_{abcd}$, we can write a necessary and sufficient condition
[23, 93] on a null vector $l^a$ for it to point in a principal null direction as

    $l_{[a} C_{b]cd[e}\, l_{f]}\, l^c l^d = 0$    (7.7)
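
The factorization (7.4) and (7.5) can be illustrated numerically. The sketch below is not from the text: the function names are hypothetical, spinors are represented as pairs of complex components, and the form is taken to be $\Psi_{ABCD}\xi^A\xi^B\xi^C\xi^D$ with $\xi^0 = 1$, $\xi^1 = z$. It builds $\Psi_{ABCD}$ as the symmetrized product of four chosen spinors and evaluates both sides of (7.4).

```python
from itertools import permutations, product


def weyl_from_principal_spinors(alpha, beta, gamma, delta):
    """Components Psi[(A, B, C, D)] of alpha_(A beta_B gamma_C delta_D)."""
    spinors = (alpha, beta, gamma, delta)
    psi = {}
    for indices in product((0, 1), repeat=4):
        total = 0j
        # Average over all 24 assignments of the index values to the four spinors.
        for perm in permutations(indices):
            term = 1 + 0j
            for spinor, index in zip(spinors, perm):
                term *= spinor[index]
            total += term
        psi[indices] = total / 24
    return psi


def quartic(psi, z):
    """Psi_ABCD xi^A xi^B xi^C xi^D with xi^0 = 1 and xi^1 = z."""
    return sum(value * z ** sum(indices) for indices, value in psi.items())


def linear_factors(spinors, z):
    """Right-hand side of (7.4): the product of the four linear forms."""
    result = 1 + 0j
    for s in spinors:
        result *= s[0] + s[1] * z
    return result
```

Under these assumptions, each principal spinor with nonzero second component contributes a root of the quartic at $z = -\alpha_0/\alpha_1$, and coincident principal null directions show up as repeated roots.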


The classification scheme (7.6) plays a significant part in the
understanding of the geometric structure of gravitational fields. Perhaps the most
directly physical role of the principal null directions emerges in the
"peeling-off" property of Sachs [64, 76, 93-95]. This states that, for an asymptotically
flat space-time which is empty near infinity (definitions in Section 8), the
curvature tensor exhibits a certain characteristic asymptotic behavior along any
null geodesic $\gamma$. Let $r$ be an affine parameter on $\gamma$. Then, in a
well-defined sense, the curvature falls off along $\gamma$ in such a way that to order $r^{-1}$
it is null, with the quadruple principal null direction pointing along $\gamma$; to
order $r^{-2}$, there is a triple principal null direction along $\gamma$; to order $r^{-3}$, a
double one; to order $r^{-4}$, a simple one; and to order $r^{-5}$ the curvature is
unrelated to $\gamma$ (Fig. 9). This is the generic situation and more special
behavior is also possible. An outline of a proof of this result is given in
Section 8.
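
Schematically, the peeling-off statement can be summarized as an expansion in inverse powers of $r$. The coefficient notation $C^{(k)}_{abcd}$ is introduced here only for illustration (it is not the text's), and the formula is a sketch of the statement above rather than a precise theorem:

```latex
C_{abcd} = \frac{C^{(1)}_{abcd}}{r} + \frac{C^{(2)}_{abcd}}{r^{2}}
         + \frac{C^{(3)}_{abcd}}{r^{3}} + \frac{C^{(4)}_{abcd}}{r^{4}} + O(r^{-5}),
```

where, in the generic case, the tangent to $\gamma$ is a quadruple principal null direction of $C^{(1)}_{abcd}$ (scheme {4}), a triple one of $C^{(2)}_{abcd}$ ({31}), a double one of $C^{(3)}_{abcd}$ ({211}), and a simple one of $C^{(4)}_{abcd}$ ({1111}).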


FIGURE 9. The peeling-off property.


The decomposition (6.19) was given purely algebraically. We may
ask whether there is a more direct geometrical (or "physical") way of
distinguishing the effect of the Weyl tensor from that of other parts of the
curvature. Also, can one see physically what the meaning of the principal
null directions is? The most direct physical manifestation of space-time
curvature is to be found in the "tidal forces" due to the geodesic deviation
effect on timelike geodesics [82, p. 266]. The relevant equation can be
written

    $D^2 p^a - R^a{}_{bcd}\, t^b p^c t^d = 0$    (7.8)

where the $t^a$ are tangents to a congruence of timelike geodesics, smoothly
parametrized according to proper time $s$, say, so $t^a t_a = 1$. The vector $p^a$
connects points of equal $s$ value on two neighboring geodesics (Fig. 10). The
operator $D$ denotes covariant differentiation in the direction $t^a$, that is,

    $D = t^a \nabla_a$    (7.9)

The equation (7.8) is familiar to mathematicians as the Jacobi equation.
[Actually (7.8) holds under more general conditions than those stated,
namely, that the parameter $s$ need only be affine on each geodesic.] If we
choose $p^a$ orthogonal to $t^a$ (as we may) then $D^2 p^a$ measures the relative
acceleration of neighboring inertial test particles. [A simple "intuitive
derivation" of (7.8), when the two neighboring geodesics are initially parallel,
is obtained by simply transporting the vector $t^a$ around the loop ROPSR of
Fig. 10. The change in $t^a$ gives us the change in the relative velocity of the
neighboring particle.]




FIGURE 10. The relative acceleration of neighboring geodesics in the presence 
of curvature. 


As applied to timelike geodesics, (7.8) does not enable us to separate, 
in any simple way, the effects due to the different irreducible parts of the 
curvature tensor. But if we use null geodesics, the effects due to the Weyl
tensor and due to the (trace-free) Ricci tensor become sharply distinguished 
[83,102]. Since the case of null geodesics has some peculiar features which 
are rather suitable to a spinor treatment, I shall give an independent spinor 
derivation of the relevant equations, rather than trying to deduce them from 
(7.8). This will also give us an interpretation of some of the spin coefficients, 
which will be useful later. 

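Choose a spin frame $\varepsilon_{\mathfrak{A}}{}^{A}$ and put

    $\varepsilon_0{}^A = o^A, \qquad \varepsilon_1{}^A = \iota^A$    (7.10)

From the normalization (4.27) we get

    $o_A \iota^A = 1$    (7.11)

Let us try to interpret the following spin coefficients [cf. (6.2)]:

    $\kappa = \gamma_{0000'}, \qquad \rho = \gamma_{0010'}, \qquad \sigma = \gamma_{0001'}$    (7.12)

These are all of the form $o^A \nabla_{\mathfrak{C}\mathfrak{D}'} o_A$, where $\mathfrak{C}, \mathfrak{D}'$ are, respectively, $0, 0'$; $1, 0'$;
$0, 1'$. Thus, under

    $o^A \to \zeta o^A, \qquad \iota^A \to \zeta^{-1} \iota^A$    (7.13)

we have

    $\kappa \to \zeta^3 \bar\zeta\, \kappa, \qquad \rho \to \zeta \bar\zeta\, \rho, \qquad \sigma \to \zeta^3 \bar\zeta^{-1} \sigma$    (7.14)

the derivative terms canceling because of $o^A o_A = 0$. Now, consider the
meaning of the condition $\kappa = 0$:

    $o^A o^C \bar{o}^{C'} \nabla_{CC'} o_A = 0$    (7.15)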


We can read this as something of the form $o^A \theta_A = 0$, from which we can infer
$\theta_A = \chi o_A$, for some $\chi$. Thus $\kappa = 0$ means

    $o^C \bar{o}^{C'} \nabla_{CC'} o_A = \chi o_A$    (7.16)

Putting

    $l^a = o^A \bar{o}^{A'}, \qquad m^a = o^A \bar{\iota}^{A'}$    (7.17)

we rewrite (7.16) as $l^a \nabla_a o_A = \chi o_A$, which states that $o_A$ is carried parallel to
itself in the $l^a$-direction (that is, in the $o^A$-direction). That is to say, the
condition $\kappa = 0$ is necessary and sufficient for the $o^A$-directions to be tangent
to null geodesics. Thus, $\kappa$ is a kind of measure of curvature of the $o^A$-curves.
(It is complex, so it also indicates the direction of this curvature, in relation to
the flag plane of $o^A$.)

Suppose now that $\kappa = 0$, so we have a congruence of null geodesic
$o^A$-curves, and read (7.15) in two other ways, namely, as $o^C \phi_C = 0$ or as
$\bar{o}^{C'} \psi_{C'} = 0$. Thus, similarly to the above, we have relations of the form

    $o^A \bar{o}^{C'} \nabla_{CC'} o_A = \rho\, o_C$    (7.18)
    $o^A o^C \nabla_{CC'} o_A = \sigma\, \bar{o}_{C'}$    (7.19)

for some $\rho$, $\sigma$. In fact, $\rho$ and $\sigma$ are the quantities given in (7.12), as follows
at once by multiplying (7.18) by $\iota^C$ and (7.19) by $\bar{\iota}^{C'}$. The significance of
(7.18) and (7.19) is that they show that when $\kappa = 0$, the quantities $\rho$ and $\sigma$ are
independent of $\iota^A$. Also, since they are simply rescaled according to (7.14),
as $o^A$ undergoes (7.13), we expect to find a simple geometric meaning for
$\rho$ and $\sigma$, as a property of the congruence of null geodesics. To see that this is
actually the case, consider a 2-plane element $\pi$ at a point $P$ of one of the
geodesics $\gamma$. Choose $\pi$ spacelike and orthogonal to the direction of $\gamma$. We
can select $\iota^A$ also to have direction orthogonal to $\pi$, so $\pi$ will be spanned by
$m^a$ and $\bar{m}^a$, with $m^a$ as in (7.17). Now any (real) vector in $\pi$ has the form

    $\bar{z} m^a + z \bar{m}^a$    (7.20)

so we may regard $\pi$ as being the Argand plane of $z$. In the neighborhood
of $\gamma$, a two-dimensional set of the null geodesics "hit" $\pi$. These are the null
geodesics which lie, to first order, in a null hyperplane through $\gamma$. There are
also other null geodesics which are neighboring to $\gamma$, but we do not consider
them. We shall examine how $z$ varies as we follow one of the null geodesics
which does hit $\pi$. [Note that the $z$, at $P$, for a particular null geodesic of the
congruence, near $\gamma$, is actually independent of the choice of $\pi$ at $P$. Altering
$\pi$ through $P$ amounts merely to replacing $m^a$ by $m^a + k l^a$ in (7.20), with
$k$ complex.]
Put

    $D = l^a \nabla_a = \nabla_{00'}$    (7.21)


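For convenience, choose $o^A$, $\iota^A$ to be parallelly propagated along $\gamma$:

    $D o^A = 0, \qquad D \iota^A = 0$    (7.22)

Then $l^a$, $m^a$, and therefore also $\pi$, will be parallelly propagated along $\gamma$. Let $\phi$
be any function constant along the null geodesics:

    $D\phi = 0$    (7.23)

We characterize the fact that we require a fixed value of $z$ to label a definite
geodesic of our system neighboring $\gamma$ by

    $D\{(z \bar{m}^a + \bar{z} m^a)\nabla_a \phi\} = 0$    (7.24)

[see (7.20)]. This gives

    $D\bar{z}\, \nabla_{01'}\phi + Dz\, \nabla_{10'}\phi + \bar{z}(\nabla_{00'}\nabla_{01'} - \nabla_{01'}\nabla_{00'})\phi + z(\nabla_{00'}\nabla_{10'} - \nabla_{10'}\nabla_{00'})\phi = 0$    (7.25)

that is, by (6.31) (or directly evaluating these particular commutators)

    $(D\bar{z} + \bar\rho \bar{z} + \bar\sigma z)\nabla_{01'}\phi + (Dz + \rho z + \sigma \bar{z})\nabla_{10'}\phi = 0$    (7.26)

Thus

    $Dz = -\rho z - \sigma \bar{z}$    (7.27)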
The interpretation of $\rho$ and $\sigma$ now becomes clear. The real part of $\rho$
measures the convergence of the null geodesics and the imaginary part of $\rho$
measures their rotation about $\gamma$. A nonvanishing $\sigma$ indicates the presence of
distortion or shear (Fig. 11). Thus, if $\sigma \neq 0$, a small circle will become
elliptical as we follow the null geodesics. The axes of this ellipse are defined
in relation to the flag plane of $o^A$ by $\tfrac{1}{2}\arg\sigma$. If the null geodesics are the
generators of null hypersurfaces then we can choose $l_a = \nabla_a u$, whence

    $m^a \bar{m}^b (\nabla_a l_b - \nabla_b l_a) = 0$

from which we get absence of rotation:

    $\rho = \bar\rho$    (7.28)

FIGURE 11. The interpretation of $\rho$ and $\sigma$ in terms of their effect on an initially
circular pattern in the null rays. (Panels: Re $\rho$, Im $\rho$, $\sigma$.)


Condition (7.28) is also sufficient for the null geodesics to generate null
hypersurfaces. If (7.28) holds, then the $\rho$ and $\sigma$ pertain to the geometry of
each null hypersurface individually and not to the relationship among the
different null hypersurfaces.

Let us next consider the $D$-derivatives of $\rho$ and $\sigma$. We can obtain these
from (6.30) [using (7.22): $\gamma_{\mathfrak{A}\mathfrak{B}00'} = 0$], or else directly, using the definitions of
$\rho$, $\sigma$, $\Phi_{ABC'D'}$, and $\Psi_{ABCD}$. Thus,

    $D\rho = \rho^2 + \sigma\bar\sigma + \Phi$    (7.29)
    $D\sigma = \sigma(\rho + \bar\rho) + \Psi$    (7.30)

where

    $\Psi = \Psi_{0000}, \qquad \Phi = \Phi_{000'0'}$    (7.31)

Equations (7.29) and (7.30) are the Sachs equations [93] for the "optical
scalars" $\rho$, $\sigma$. Combining these with (7.27) we get

    $D^2 z = -\Phi z - \Psi \bar{z}$    (7.32)
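
As a check, (7.32) follows in a few lines. The sketch below assumes (7.27), (7.29), and (7.30) in the forms $Dz = -\rho z - \sigma\bar z$, $D\rho = \rho^2 + \sigma\bar\sigma + \Phi$, and $D\sigma = \sigma(\rho + \bar\rho) + \Psi$, and uses only the Leibniz rule for $D$:

```latex
\begin{align*}
D^2 z &= -(D\rho)\,z - \rho\,Dz - (D\sigma)\,\bar z - \sigma\,D\bar z \\
      &= -(\rho^2 + \sigma\bar\sigma + \Phi)\,z + \rho(\rho z + \sigma\bar z)
         - \bigl(\sigma(\rho + \bar\rho) + \Psi\bigr)\bar z
         + \sigma(\bar\rho\,\bar z + \bar\sigma z) \\
      &= -\Phi z - \Psi\bar z .
\end{align*}
```

All the terms quadratic in $\rho$ and $\sigma$ cancel, which is why the second-order equation for $z$ is linear in the curvature quantities alone.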


In fact, (7.32) can also be obtained directly from (7.8), if we put $t^a = l^a$ and
$p^a = \bar{z} m^a + z \bar{m}^a$. In this connection, we note the formulas


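    $\Psi = R_{abcd}\, l^a m^b l^c m^d, \qquad \Phi = R_{abcd}\, l^a m^b l^c \bar{m}^d;$
    $\rho = m^a \bar{m}^b \nabla_a l_b, \qquad \sigma = m^a m^b \nabla_a l_b$    (7.33)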
We see from (7.29) that it is essentially $\Phi$ which governs the change in
$\rho$ and from (7.30) that it is essentially $\Psi$ which governs the change in $\sigma$, but
that in both cases there are additional nonlinear terms. To obtain the effects
of $\Phi$ and $\Psi$ in purest form, consider a particular situation in which $\Phi$ and $\Psi$
are localized to a very small region on $\gamma$. [In fact idealized situations are
also possible for which $\Phi$ and $\Psi$ become Dirac delta functions. See the
discussion of plane waves (9.23)-(9.26).] Suppose that we have a pulse of
light which is initially a parallel beam. This is represented by the null
geodesics constituting a small portion (neighboring $\gamma$) of a null hypersurface
through $\gamma$, for which $\rho = \sigma = 0$. Then, by (7.29), we see that a patch of $\Phi$
curvature (with $\Psi = 0$) acts as a lens without astigmatism (since convergence
$\rho$ is introduced into the beam but the shear $\sigma$ remains zero) while a patch of
$\Psi$ curvature (with $\Phi = 0$) acts as a purely astigmatic lens; that is, exactly as
much positive convergence is introduced in one plane as negative convergence
in the perpendicular plane (since $\rho$ is initially zero for the emergent beam with
$\sigma$ nonzero). The interpretation of a patch of curvature in terms of lenses
also holds good for a general incident beam, where $\rho$ and $\sigma$ need not initially
vanish. Also, the nonlinear terms in (7.29) and (7.30) correspond exactly
to the way that the effects of separated lenses compose [78]. Thus, we may
think of the effect on a beam of light, of the curvature of space-time, as being
built up precisely from effects due to a series of lenses placed along the beam.
The converging power is described locally by $\Phi$ and the astigmatism by $\Psi$.
As $\gamma$ varies, the components (7.31) generate all the information contained in
$\Psi_{ABCD}$ and $\Phi_{ABC'D'}$, because of the symmetries (6.21).

One important feature of this is that we always have positive focusing
along any null geodesic $\gamma$, if we assume that the local energy density is positive
definite. The energy density, measured by an observer whose world line
has tangent vector $t^a$, is

    $T_{ab}\, t^a t^b$    (7.34)

if we assume $t^a t_a = 1$. If we require (7.34) to be nonnegative, then we have
$T_{ab}\, t^a t^b \ge 0$ for any timelike vector $t^a$ (irrespective of the normalization
condition on $t^a$). Thus, if we take the limit as $t^a$ approaches a null vector $l^a$,
we get [by (7.3)]

    $\Phi = K T_{ab}\, l^a l^b \ge 0$    (7.35)

(whether or not $\lambda = 0$, since $g_{ab}\, l^a l^b = 0$). In the absence of $\Psi$, this shows
that the curvature $\Phi$ along the beam acts as a series of positive (converging)
lenses. If $\Psi$ is present, its effect is felt in Eq. (7.29) via the $\sigma\bar\sigma$ term,
which again acts positively. We can regard $\sigma$ as being, roughly speaking, an
integral of $\Psi$ along $\gamma$. Thus, $\Psi$ acts nonlocally to contribute to the focusing
power along $\gamma$. An oscillating $\Psi$ closely resembles the presence of $\Phi$ in its
total effect. This is closely related to the question of gravitational energy.
Although locally the gravitational field does not contribute to the energy
density, a gravitational wave does carry a kind of nonlocal energy, the total
energy carried by the wave being always positive. (A striking example of the
effective positive energy of gravitational "ripples" is supplied by Wheeler's
[112] theoretical construct: the gravitational "geon." This is a region of
empty space which consists of ripples which remain bound for a long time.
When viewed on a large enough scale, a geon appears as a "particle" with
positive mass.)

Let us examine the positive focusing effect in a little more detail. If
we choose a small triangle in the plane element $\pi$ with vertices defined by
$0$, $z_1$, $z_2$, then its area


    $a = \tfrac{1}{4} i\,(z_1 \bar{z}_2 - z_2 \bar{z}_1)$    (7.36)


satisfies [by (7.27)]

    $Da = -(\rho + \bar\rho)\,a$    (7.37)
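
Equation (7.37) can be verified directly. Assuming (7.27) in the form $Dz = -\rho z - \sigma\bar z$ for both $z_1$ and $z_2$, the antisymmetric combination $z_1\bar z_2 - z_2\bar z_1$, to which the area is proportional, satisfies:

```latex
\begin{align*}
D(z_1\bar z_2 - z_2\bar z_1)
 &= (Dz_1)\bar z_2 + z_1 D\bar z_2 - (Dz_2)\bar z_1 - z_2 D\bar z_1 \\
 &= -(\rho + \bar\rho)(z_1\bar z_2 - z_2\bar z_1)
    - \sigma(\bar z_1\bar z_2 - \bar z_2\bar z_1)
    - \bar\sigma(z_1 z_2 - z_2 z_1) \\
 &= -(\rho + \bar\rho)(z_1\bar z_2 - z_2\bar z_1).
\end{align*}
```

The shear terms drop out identically, so the rate of change of area depends only on the real part of $\rho$, whatever constant normalization is used in the definition of $a$.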


With our beam chosen as a portion of a null hypersurface we have $\rho = \bar\rho$.
Thus,

    $D(a^{1/2}) = -\rho\, a^{1/2}$    (7.38)

From (7.29),

    $D^2(a^{1/2}) = D(-\rho\, a^{1/2}) = -(\sigma\bar\sigma + \Phi)\, a^{1/2} \le 0$    (7.39)

Thus, if $\rho > 0$ at some point of $\gamma$, (7.38) tells us that $a^{1/2}$ is decreasing. From
(7.39) it follows that $a^{1/2}$ decreases to zero²⁰ (see Fig. 12). Thus the beam
inevitably reaches a focal point $Q$. (This will in general be an astigmatic
focal point where the beam collapses locally to a line rather than a point.)
Note that, near $Q$, $\rho$ becomes unbounded since by (7.38), $\rho = -\tfrac{1}{2} D(\log a)$.
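
The focusing argument can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes a shear-free beam ($\sigma = 0$, $\Psi = 0$), so that by (7.38) and (7.39) the quantity $y = a^{1/2}$ obeys $D^2 y = -\Phi y$ with $Dy = -\rho y$; the function name `focal_distance`, the callable `phi`, and the step size are hypothetical choices, and a standard fourth-order Runge-Kutta step is used for the integration.

```python
import math


def focal_distance(rho0, phi, step=1e-3, r_max=100.0):
    """First affine distance at which y = a**0.5 reaches zero, or None.

    Integrates D^2 y = -phi(r) * y with y(0) = 1 and Dy(0) = -rho0,
    i.e. (7.38) and (7.39) for a shear-free beam (sigma = 0, Psi = 0 assumed).
    """
    def deriv(r, y, dy):
        return dy, -phi(r) * y

    r, y, dy = 0.0, 1.0, -rho0
    while r < r_max:
        # One classical Runge-Kutta step for the pair (y, Dy).
        k1 = deriv(r, y, dy)
        k2 = deriv(r + step / 2, y + step / 2 * k1[0], dy + step / 2 * k1[1])
        k3 = deriv(r + step / 2, y + step / 2 * k2[0], dy + step / 2 * k2[1])
        k4 = deriv(r + step, y + step * k3[0], dy + step * k3[1])
        y_new = y + step / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        dy_new = dy + step / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        if y_new <= 0.0:
            # Linear interpolation for the zero of y inside this step.
            return r + step * y / (y - y_new)
        r, y, dy = r + step, y_new, dy_new
    return None
```

With $\Phi = 0$ the solution is $y = 1 - \rho_0 r$, so the focal point lies at $r = 1/\rho_0$; any nonnegative $\Phi$ makes $y$ concave and can only bring the focal point closer, which is the content of the inequality in (7.39).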


FIGURE 12. Once the local area of cross section of the null rays begins to
decrease, it inevitably decreases to zero. (Labels: $Q$; affine parameter.)


This behavior is a limiting case of the Raychaudhuri effect [49, 86], which
applies to a hypersurface-orthogonal timelike congruence of geodesics. Let
$t^a$ be the tangent vectors to the geodesics of the congruence, parametrized
according to proper time $s$ (so $t^a t_a = 1$, with $t^a$ future-pointing) and orthogonal
to a spacelike hypersurface $s = 0$. Define the divergence of the geodesics by

    $\theta = \nabla_a t^a$    (7.40)

From the Ricci identities we obtain Raychaudhuri's equation

    $D\theta = -\nabla_a t^b\, \nabla_b t^a + R_{ab}\, t^a t^b$    (7.41)

with $D = t^a \nabla_a$ as in (7.9).


20 There is one simple application of this property which does not seem to have been
noted before, namely, to the question of whether a spherically symmetrical body in
asymptotically flat space-time can fail to have a center (see [9] p. 436). If we consider the light
rays (null geodesics) converging inward from an instantaneous sphere situated
symmetrically about the body at a reasonable distance from it, we see that the rays initially
converge, so that $\rho > 0$. They have to reach a (symmetrical) focal point (assuming (7.35)
holds) and this focal point defines the center. (Compare the discussion of trapped surfaces
given in Section 10.)


From the hypersurface orthogonality we get $\nabla_a t_b = \nabla_b t_a$, whence

    $\nabla_a t^b\, \nabla_b t^a = \nabla_a t_b\, \nabla^a t^b \ge \tfrac{1}{3}\theta^2$    (7.42)

from Schwarz' inequality in the 3-space orthogonal to $t^a$ [by the geodesic
condition $t^a \nabla_a t_b = 0$ and $t^b \nabla_a t_b = \tfrac{1}{2}\nabla_a(t^b t_b) = 0$]. Thus,

    $D\theta \le -\tfrac{1}{3}\theta^2 + R_{ab}\, t^a t^b$    (7.43)

Let $V$ be a small 3-volume element orthogonal to $t^a$. Then

    $DV = \theta V$    (7.44)

by (7.40), so

    $D(V^{1/3}) = \tfrac{1}{3}\theta V^{1/3}$    (7.45)

and (7.43) gives

    $D^2(V^{1/3}) = \tfrac{1}{3} D(\theta V^{1/3}) \le \tfrac{1}{3} V^{1/3} R_{ab}\, t^a t^b \le 0$    (7.46)

if we assume that the energy condition

    $T_{ab}\, t^a t^b \ge \dfrac{\lambda}{2K} + \tfrac{1}{2} T^c{}_c$    (7.47)

holds [cf. (7.1)]. (If $\lambda = 0$, then (7.47) holds if, in an "eigentetrad" of $T_{ab}$,
the energy density is not less than minus each principal pressure, nor less than
minus the sum of the principal pressures. This would be true of all normal
matter.) At this point the argument proceeds just as before, with (7.45)
replacing (7.38) and (7.46) replacing (7.39), showing that if $\theta < 0$ at some
point on one of the geodesics, then a focal point ($V = 0$, $\theta \to -\infty$) is reached
somewhere in the future on the geodesic. Similarly, if $\theta > 0$, then a focal
point ($V = 0$, $\theta \to +\infty$) is reached in the past on the geodesic.
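
The eigentetrad criterion quoted for (7.47) with $\lambda = 0$ can be spot-checked numerically. The following is a rough sketch under assumptions not spelled out in the text: the energy tensor is taken diagonal in an orthonormal frame, with energy density `rho` and principal pressures `pressures`, the signature is (+ − − −), so that $T^c{}_c$ is the density minus the sum of the pressures, and the function names are hypothetical. It compares the criterion with sampled values of $T_{ab} t^a t^b - \tfrac{1}{2} T^c{}_c$ over unit timelike vectors.

```python
import random


def satisfies_criterion(rho, pressures):
    """Criterion stated in the text for (7.47) with lambda = 0."""
    return all(rho + p >= 0 for p in pressures) and rho + sum(pressures) >= 0


def worst_margin(rho, pressures, samples=2000, seed=0):
    """Smallest sampled value of T_ab t^a t^b - (1/2) T^c_c over unit timelike t^a."""
    rng = random.Random(seed)
    trace = rho - sum(pressures)  # T^c_c for signature (+ - - -)
    worst = float("inf")
    for _ in range(samples):
        spatial = [rng.uniform(-5.0, 5.0) for _ in pressures]
        t0_squared = 1.0 + sum(x * x for x in spatial)  # enforces t^a t_a = 1
        energy = rho * t0_squared + sum(p * x * x for p, x in zip(pressures, spatial))
        worst = min(worst, energy - 0.5 * trace)
    return worst
```

In this frame the margin equals $\tfrac{1}{2}(\rho + \sum p_i) + \sum (\rho + p_i)(t^i)^2$ (here $\rho$ denotes the energy density, not the spin coefficient), which is nonnegative for every unit timelike $t^a$ exactly when the two stated inequalities hold.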

One final remark concerning (7.30) is appropriate here. We have,
by (7.31),

    $\Psi = \Psi_{ABCD}\, o^A o^B o^C o^D$    (7.48)

so, comparing with (7.4) and (7.5), we see that $\Psi$ vanishes when $o^A$ points in
a principal null direction. Thus, we can give a physical interpretation of the
principal null directions as those null directions (at a point $P$) along which
no astigmatic focusing takes place. From the fact that astigmatism is a
directional quantity, we can actually see from topological considerations that
there must be at least a net algebraic count of four null directions of vanishing
astigmatism at $P$. We consider an $S^2$ representing the null directions at $P$,
on which we put a small line element to represent the astigmatism for each
null direction at $P$. The principal null directions emerge as four points
where the line element vanishes, as indicated in Fig. 13. This type of
argument applies to other spins (for example, to spin 1, namely, electromagnetism,
when the analogous line element is oriented, so the Fig. 13 behavior cannot
occur and we get just two principal null directions). The argument can also
be used in other situations, such as for the asymptotic fields for which $S^2$
would be at infinity rather than at $P$.

FIGURE 13. A principal null direction appears as a singularity in the field of
directions of astigmatism, on the $S^2$ representing null directions
at $P$. (Labels: principal null direction; planes of astigmatism.)


8 CONFORMAL INFINITY 


The question of the meaning of gravitational energy has been briefly 
touched upon in the previous section. The utility of the concept of energy, 
in general, arises from the fact that it is conserved. In Einstein’s theory there 
is a “local conservation law”’ for energy built in, namely, 


    $\nabla_a T^{ab} = 0$    (8.1)


this being, by virtue of the field equations (7.1), a double contraction of the 
Bianchi identities: 


    $\nabla_{[a} R_{bc]de} = 0$    (8.2)


However, (8.1) does not yield an integral conservation law of the usual type, 
that is, a law stating that an integral over the boundary of some compact 
4-volume of the ‘‘ flux” of some quantity across this boundary necessarily 
vanishes. 

The prototype of such an integral conservation law is given by electric 
charge. If $J^a$ denotes the charge current vector, then we have


    $\nabla_a J^a = 0$    (8.3)


Since $J^a$ is a "vector" or, more appropriately, the "dual of a 3-form,"
the local conservation law (8.3) implies the existence of an integral
conservation law of the above type, where the flux of charge across a surface element
$dS_a$ is $J^a\, dS_a$. However, (8.1) is not of this form. We can get an approximate
integral conservation law if we integrate over the boundary of some region
whose dimension is very small compared with the radii of curvature involved
in $R_{abcd}$. For this, we may introduce a basis frame $\delta^{\mathfrak{b}}{}_{b}$ which is approximately
covariantly constant over the region, and read (8.1) as $\nabla_a T^{a\mathfrak{b}} \approx 0$. This is now
of the form (8.3), so an approximate energy-momentum integral conservation
law arises with energy-momentum flux $T^{a\mathfrak{b}}\, dS_a$. We may regard the
space-time curvature as giving a (nonlocal) gravitational contribution to the
energy-momentum which has to be taken into account to get an exact integral
conservation law. This can be formalized in various ways (by the use of
"pseudotensors", etc.), but no method has yet emerged which, in general
situations, assigns any kind of meaningful uniqueness to the gravitational
energy-momentum flux, say, or to the total energy-momentum in a region.
I think we may regard the question of gravitational energy as one of the
important unsolved problems in general relativity theory.

But have we any right to expect a solution to this problem at all?
Might it not be that energy-momentum is simply not quite conserved in
general relativity, or, put another way, that the energy-momentum concept
is not (except locally) really an appropriate one for the subject? It may be
that in general this is so. I do not know. However, it is (to my mind) a
very remarkable fact that a meaningful (that is, "covariantly" or
"geometrically" defined) energy-momentum concept really does exist for an
interesting subclass of space-times, namely, those which are asymptotically
flat.

The study of asymptotically flat space-times forms an important part of
general relativity theory. This is not because it is believed that the universe
is very likely to be asymptotically flat, but because in any situation in which
general-relativistic effects are expected to be important (that is, with the
notable exception of the study of the universe as a whole) the curvatures
involved in the local process will exceed by many orders of magnitude the
smoothed-out curvatures of the general background. (For example, even
if we collapse an entire galaxy, the size of the resulting object of interest would
be smaller than the radius of the background curvature of the universe by a
factor of about $10^{11}$. See the discussion in Section 10.) Thus, asymptotic
flatness is an excellent approximation in a large number of situations. This
is fortunate, because it is only with asymptotic flatness that general relativity
begins to resemble much of the rest of physics. We can then discuss the
question of advanced and retarded waves, gravitational or otherwise. We
can discuss the energy carried by these waves, scattering problems, even
perhaps quantization. I shall not go into any of this in any detail here.
I merely describe a mathematically precise framework for the discussion of
these questions and just indicate some of the applications.

In order to fix ideas, let us consider, in more physical terms, the question
of whether gravitational waves emitted by an isolated system carry energy,
and if they do, whether the energy they carry is necessarily positive. (The
existence of gravitational waves emitted by a system of varying asymmetry
is inferred, in the first instance, from the linear approximation to Einstein's
theory.) To find out what energy we should assign to the waves, we envisage
measuring the mass $m_1$ of the system before the waves are emitted and then
measuring the mass $m_2$ of the system after the waves are emitted. If energy
(that is, mass) conservation is to retain any meaning, we require that the
total energy carried by the waves be just $m_1 - m_2$ (assuming incoming
waves are not present, which might be absorbed by the system).

But how are we to measure the mass? One way might be to integrate
some expression of mass density over a spacelike hypersurface $\mathcal{S}$. However,
we would have to take into account the "nonlocal mass density" of the
gravitational field itself and this complicates things. As an alternative, we
might prefer to measure the mass by examining the field only in the
neighborhood of infinity on $\mathcal{S}$, since the way in which the curvature falls off at
large distances on $\mathcal{S}$ should fix the total mass. Instead of a 3-volume
integral, we then use just a 2-surface integral at infinity. In fact, either
method can be used to give a satisfactory definition. But in both cases, our
spacelike hypersurface $\mathcal{S}$ has to extend to infinity and we have a difficulty
when it comes to measuring the mass $m_1 - m_2$ carried by the waves. We
would "normally" require that $\mathcal{S}$ should suitably tend to become a spacelike
hyperplane as our space-time approaches flatness at infinity. This is not
unreasonable for the hypersurface $\mathcal{S}_1$, which intercepts the source's world
line before the emission of the radiation, to yield $m_1$ (Fig. 14). But in the
case of such a hypersurface $\mathcal{S}_2$, which intercepts the source after the emission
of the radiation, we do not get $m_2$ as our mass measurement, but simply $m_1$
again. This is because $\mathcal{S}_2$ intercepts all the waves, and whatever energy they
carry has to be added in, giving $m_1 = m_2 + (m_1 - m_2)$.

We might try to carry out some complicated limiting process, allowing
the hypersurface to move "upward" into the future while letting the size of
the region of integration extend gradually out to infinity, the waves remaining
outside the region. Let us apply this to the second of the above two
suggestions for measuring the mass. Then we are led to considerations of
2-surface integrals at infinity, not in spacelike directions, but in null directions.
This suggests the use of null (or, at least, asymptotically null) hypersurfaces
$\mathcal{N}_1$, $\mathcal{N}_2$, in place of $\mathcal{S}_1$, $\mathcal{S}_2$. Here $\mathcal{N}_1$ lies entirely outside the cone of
radiation, which in turn lies entirely outside $\mathcal{N}_2$. Then, a 2-surface integral
at infinity on $\mathcal{N}_i$ might reasonably be used to measure $m_i$ ($i = 1, 2$), the
difference $m_1 - m_2$ then giving the mass of the waves.

By achieving such a definition of mass and by showing that the
difference $m_1 - m_2$ is positive in the presence of radiation, Bondi (with his
co-workers) and, subsequently, Sachs [8, 11, 95] made an important advance in
the understanding of gravitational energy. Their definition depended on the
introduction of a specific type of coordinate system based on outgoing null
hypersurfaces. The existence of a coordinate system of the required type
was taken as the definition of the degree of asymptotic flatness required.
I propose to give another definition for asymptotic flatness here, which is
more evidently geometrical in character, but which is, in essentials, equivalent
to the Bondi-Sachs definition. The idea involves the introduction of a
boundary to space-time $\mathcal{M}$ whose points constitute future and past end-points
to each null geodesic in $\mathcal{M}$. (We have just seen that the Bondi-Sachs mass
is a 2-surface integral over "future end-points" to null geodesics, namely,
the null geodesics which generate $\mathcal{N}_i$.) It turns out, somewhat miraculously,
that Bondi-Sachs asymptotic flatness finds expression in the existence of a
smooth conformal structure [74, 76] for the space-time with boundary. Having
such a well-defined structure at infinity, we can then perform calculations by
treating infinity as though it were a local structure, and awkward asymptotic
limits are thereby avoided.

FIGURE 14. To measure mass loss through radiation, $\mathcal{N}_1$ and $\mathcal{N}_2$ are more
appropriate than $\mathcal{S}_1$ and $\mathcal{S}_2$. (Label: source.)


Structure of Space-Time 175 


Let us start by considering the nature of conformal infinity for
Minkowski space-time $\mathcal{M}$. Choose null polar coordinates $u, v, \theta, \varphi$ related
to the usual Minkowskian $x^0, x^1, x^2, x^3$ by

$u = x^0 - r, \qquad v = x^0 + r$    (8.4)

with $r = [(x^1)^2 + (x^2)^2 + (x^3)^2]^{1/2}$ and

$x^1 = r\sin\theta\cos\varphi, \qquad x^2 = r\sin\theta\sin\varphi, \qquad x^3 = r\cos\theta.$    (8.5)

Then

$ds^2 = du\,dv - \tfrac{1}{4}(u - v)^2(d\theta^2 + \sin^2\theta\,d\varphi^2)$    (8.6)

($u \le v$). Let us choose new coordinates $p, q$ with

$v = \tan p, \qquad u = \tan q$    (8.7)

($-\pi/2 < q \le p < \pi/2$; see Fig. 15) so that points at infinity are assigned finite
$p, q$ coordinate values. The metric (8.6) ceases to be meaningful at such
values ($p = \pi/2$ or $q = -\pi/2$) but if we pass to a conformally related metric

$d\hat{s} = \Omega\,ds$    (8.8)

where we choose

$\Omega = (1 + u^2)^{-1/2}(1 + v^2)^{-1/2}$    (8.9)

then (8.6) becomes

$d\hat{s}^2 = dp\,dq - \tfrac{1}{4}\sin^2(p - q)(d\theta^2 + \sin^2\theta\,d\varphi^2)$    (8.10)

FIGURE 15. The range of $p$ and $q$. To construct $\bar{\mathcal{M}}$, rotate about $p = q$, so each
point $p > q$ describes an $S^2$ (but keep $I^0$ fixed!).

The metric (8.10) is perfectly regular at $p = \pi/2$ and at $q = -\pi/2$
(except that at $p = q$ or $p - q = \pi$, we have removable coordinate singularities).
Thus, we have a well-defined conformal structure on a manifold ($-\pi/2 \le
q \le p \le \pi/2$) with boundary²¹ ($q = -\pi/2$ or $p = \pi/2$) whose interior ($-\pi/2 <
q \le p < \pi/2$) is identical in conformal structure with Minkowski space-time.
Note that $q$ = const. and $p$ = const. are null hypersurfaces (cones) and are
therefore generated by null geodesics. This then applies, in particular, to
the boundary hypersurfaces.
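
As a numerical sanity check on the step from (8.6) to (8.10), the following minimal sketch (Python, standard library only; the function names are illustrative and do not come from the text) multiplies the coefficients of (8.6) by $\Omega^2$ from (8.9), using $du\,dv = (1+u^2)(1+v^2)\,dp\,dq$, and compares the result with the coefficients read off from (8.10).

```python
import math

def omega_squared(p, q):
    """Square of the conformal factor (8.9), with u = tan q, v = tan p as in (8.7)."""
    u, v = math.tan(q), math.tan(p)
    return 1.0 / ((1.0 + u * u) * (1.0 + v * v))

def dudv_per_dpdq(p, q):
    """Factor in du dv = (1 + u^2)(1 + v^2) dp dq, since d(tan x) = (1 + tan^2 x) dx."""
    u, v = math.tan(q), math.tan(p)
    return (1.0 + u * u) * (1.0 + v * v)

def rescaled_coefficients(p, q):
    """Coefficients of dp dq and of the unit-sphere metric after multiplying (8.6) by Omega^2."""
    u, v = math.tan(q), math.tan(p)
    w2 = omega_squared(p, q)
    return w2 * dudv_per_dpdq(p, q), w2 * 0.25 * (u - v) ** 2

def expected_coefficients(p, q):
    """The same two coefficients as they appear in (8.10)."""
    return 1.0, 0.25 * math.sin(p - q) ** 2

if __name__ == "__main__":
    for p, q in [(0.3, -0.2), (1.2, -1.4), (1.5, 0.1)]:
        print(rescaled_coefficients(p, q), expected_coefficients(p, q))
```

The agreement rests on the identity $\tan p - \tan q = \sin(p - q)/(\cos p \cos q)$ together with $\Omega = \cos p \cos q$.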

The metric (8.10) is, in fact, that of Einstein's static universe $\mathcal{E}$. If we
allow $0 \le p - q \le \pi$, with $p + q$ unrestricted (and with the usual range for the
spherical polar coordinates $\theta, \varphi$), then (8.10) refers to a "cylinder"
$\mathcal{E} = S^3 \times E^1$, it being a simple matter to rewrite (8.10) in the form
$d\hat{s}^2 = dT^2 - d\Sigma^2$, with $d\Sigma^2$ standing for the usual metric on a 3-sphere. (The
regions of coordinate singularity in (8.10) can now be covered by applying
a rotation in $S^3$ to the coordinates $p, q, \theta, \varphi$.) The portion of $\mathcal{E}$ which
corresponds to $\bar{\mathcal{M}}$ can be described geometrically as follows (see Fig. 16).
Choose a point $I^-$ on $\mathcal{E}$ (this will be $p = q = -\pi/2$). The null geodesics on
$\mathcal{E}$ which have past end-point at $I^-$ will generate the future null cone of
$I^-$. These null geodesics will reach a first focal point which is diametrically
opposite (with respect to $S^3$) to $I^-$. Call this point $I^0$ (this is to be
$p = -q = \pi/2$).

The (open) null geodesic segments from $I^-$ to $I^0$ sweep out a hypersurface
region $\mathscr{I}^-$ (to be $-\pi/2 < p < \pi/2$, $q = -\pi/2$) which is bounded at
the past by $I^-$ and at the future by $I^0$. ($\mathscr{I}^-$ has topology $S^2 \times E^1$.) The
hypersurface $\mathscr{I}^-$ defines the past null cone at $I^0$. Continue these null
geodesics into the future beyond $I^0$. They reach a second focal point $I^+$
(to be $p = q = \pi/2$), which is diametrically opposite to $I^0$ (with respect to $S^3$)
and therefore lies directly "above" $I^-$ in $\mathcal{E}$. The (open) null geodesic
segments from $I^0$ to $I^+$ will sweep out a null hypersurface region $\mathscr{I}^+$ (to be
$p = \pi/2$, $-\pi/2 < q < \pi/2$), again of topology $S^2 \times E^1$, and bounded by $I^0$
at the past and $I^+$ at the future. The part of $\mathcal{E}$ which may be identified with
$\mathcal{M}$ (as regards conformal structure) is then just the set of points of $\mathcal{E}$ which lie
"between" $\mathscr{I}^-$ and $\mathscr{I}^+$ (that is, the set of points which lie between $I^-$ and
$I^+$ on timelike curves from $I^-$ to $I^+$). Thus $\mathcal{M}$ is an open subset of $\mathcal{E}$ and
its boundary is $I^- \cup \mathscr{I}^- \cup I^0 \cup \mathscr{I}^+ \cup I^+$.

FIGURE 16. Conformal Minkowski space-time as a portion of the Einstein
universe.


21 Strictly speaking, we do not quite have a manifold with boundary here, because
of "corners" on the boundary. These will be removed for the strict definition to be given
shortly.

There is another version of compactified conformal Minkowski space-time in which
the future boundary hypersurface is identified with the past. Then the whole structure
becomes a compact conformal manifold without boundary ($\cong S^1 \times S^3$) (see [51, 92];
also [76, 79]).

A useful way of picturing $\bar{\mathcal{M}}$ is as the interior of two cones joined base
to base (Fig. 17). Then the various parts of $\partial\bar{\mathcal{M}}$ are represented as portions
of these bounding cones. We must bear in mind that this picture is not
conformally accurate, however. The inaccuracy is greatest near $I^0$, as
emerges particularly since $I^0$ appears as an equatorial region whereas it
should be a point. (Of course, to be accurate, we should also have to think
of the picture as four-dimensional.)

FIGURE 17. Any null geodesic in $\mathcal{M}$ attains two end-points in $\bar{\mathcal{M}}$, one on $\mathscr{I}^-$
and one on $\mathscr{I}^+$.

The interpretation of the points $I^-$, $I^0$, $I^+$ is as "past timelike infinity,"
"spacelike infinity," and "future timelike infinity." The hypersurfaces $\mathscr{I}^-$
and $\mathscr{I}^+$ are "past null infinity" and "future null infinity," respectively.
The reason for labeling them as such becomes evident if we examine straight
lines in $\mathcal{M}$ (straight according to the Minkowski metric $ds$). A timelike
straight line becomes a curve from $I^-$ to $I^+$; a spacelike straight line becomes
a closed curve through $I^0$; a null straight line is a null geodesic originating
at a point of $\mathscr{I}^-$ and terminating at a point of $\mathscr{I}^+$. In each case, the curve
becomes "compactified" by the addition of the appropriate points at infinity.
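
The end-points of straight lines can be illustrated numerically. The sketch below (Python, standard library only) is a hedged illustration, not part of the original argument: it restricts attention to lines described in the $(t, r)$ half-plane, the three sample curves and all function names are choices made here, and the limit is approximated by a large but finite parameter value.

```python
import math

HALF_PI = math.pi / 2

def to_pq(t, r):
    """Map (t, r) to (p, q) using v = t + r = tan p and u = t - r = tan q, cf. (8.4), (8.7)."""
    return math.atan(t + r), math.atan(t - r)

def timelike(lam):
    """Timelike straight line through the origin with speed 1/2: r = |t| / 2."""
    return lam, 0.5 * abs(lam)

def spacelike(lam):
    """Spacelike straight line through the origin: t = r / 2, with r = lam >= 0."""
    return 0.5 * lam, lam

def outgoing_null(lam, u0=0.7):
    """Outgoing radial null ray with retarded time u0: t = u0 + r, with r = lam >= 0."""
    return u0 + lam, lam

def endpoint(curve, lam=1e8):
    """Approximate the end-point in the (p, q) picture by a large parameter value."""
    t, r = curve(lam)
    return to_pq(t, r)

if __name__ == "__main__":
    print("timelike, future:", endpoint(timelike))         # near (pi/2, pi/2), i.e. I^+
    print("timelike, past:  ", endpoint(timelike, -1e8))   # near (-pi/2, -pi/2), i.e. I^-
    print("spacelike:       ", endpoint(spacelike))        # near (pi/2, -pi/2), i.e. I^0
    print("outgoing null:   ", endpoint(outgoing_null))    # p near pi/2, q = atan(u0): on scri^+
```

Only the null ray retains a finite free coordinate ($q = \arctan u_0$) at its end-point, which is why its end-points fill out a hypersurface rather than a single point.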

The extension of $\mathcal{M}$ as a conformal manifold does not, of course,
uniquely lead to the entire conformal Einstein universe. For, beyond the
boundary $\partial\bar{\mathcal{M}}$, we could bend the extended space as we wished. However,
the closure $\bar{\mathcal{M}}$ of $\mathcal{M}$ in $\mathcal{E}$ is uniquely defined as a conformal manifold with
boundary. For this to be quite precise, it is more logical to delete the points
$I^-$, $I^0$, $I^+$ from the definition of $\bar{\mathcal{M}}$ and also from $\partial\bar{\mathcal{M}} = \mathscr{I}$, since otherwise
$\bar{\mathcal{M}}$ would not be a manifold with boundary at these three points. This will
also fit in more accurately with the discussion of asymptotically flat space-times.
(However, I shall not always be completely consistent in this respect,
since it is useful to be able to talk about $I^-$, $I^0$, $I^+$.) There are various ways
to see that this $\bar{\mathcal{M}}$ is conformally unique. Perhaps the most direct is to
construct the points of $\mathscr{I}$ in terms of equivalence classes of null geodesics
in $\mathcal{M}$ [an alternative way of defining the points of $\mathscr{I}$ is in terms of event
horizons and particle horizons (cf. Section 9). The nontrivial horizons
can be used to define the points of $\mathscr{I}$]. The concept of a null geodesic is a
conformally invariant one, so such a construction is appropriate here. We
have to decide when two null geodesics in $\mathcal{M}$ are to be deemed as intersecting
on $\mathscr{I}$. A method given by Geroch [34, 35] (see Dr. Geroch's lecture,
Chapter VIII) can be used here to treat quite general space-times. In the
present case, where $\mathcal{M}$ is Minkowskian, two null geodesics meet at the same
point of $\mathscr{I}^+$ if and only if they belong (according to the $ds$ metric) to the
same null hyperplane in $\mathcal{M}$. If they are merely parallel in $\mathcal{M}$, then they just
meet the same generator of $\mathscr{I}^+$. For this particular space-time $\mathcal{M}$, two null
geodesics meeting on $\mathscr{I}^+$ must also meet on $\mathscr{I}^-$ and vice versa. However,
this is a very special property of Minkowski space-time and will not hold for
general $\mathcal{M}$.

Let us pass to a slightly more general space-time $\mathcal{M}$, namely, the
Schwarzschild solution for the metric of the exterior field of a spherically
symmetric massive body. I shall use the Eddington-Finkelstein [27, 32]
form of the metric with one retarded null coordinate $u$:

$ds^2 = du^2[1 - (2m/r)] + 2\,dr\,du - r^2(d\theta^2 + \sin^2\theta\,d\varphi^2)$    (8.11)

($r > 0$). (This metric is discussed more completely in Section 10.) If we
choose $d\hat{s} = \Omega\,ds$, with

$\Omega = r^{-1} = l,$    (8.12)

where $l$ is to be a new coordinate which is finite for $r = \infty$, then we get

$d\hat{s}^2 = du^2(l^2 - 2ml^3) - 2\,dl\,du - d\theta^2 - \sin^2\theta\,d\varphi^2$    (8.13)

($l > 0$). This is regular at $\mathscr{I}^+$, defined by $l = 0$, $u$ finite. Here $\mathscr{I}^+$ is
$S^2 \times E^1$ as before. To find $\mathscr{I}^-$, we re-express (8.11) in terms of an advanced
null coordinate $v = u + 2r + 4m \log(r - 2m)$:

$ds^2 = dv^2\left(1 - \frac{2m}{r}\right) - 2\,dr\,dv - r^2(d\theta^2 + \sin^2\theta\,d\varphi^2)$    (8.14)

($r > 0$) and again use (8.12). This gives

$d\hat{s}^2 = dv^2(l^2 - 2ml^3) + 2\,dl\,dv - d\theta^2 - \sin^2\theta\,d\varphi^2$    (8.14a)

($l > 0$), which is regular at $\mathscr{I}^- \cong S^2 \times E^1$, defined by $l = 0$. The only real
difference between this case and the previous Minkowskian one, as regards
the structure of infinity, lies in the fact that we do not get regular points
$I^-$, $I^0$, $I^+$ here. It is not surprising that $I^-$ and $I^+$ turn out to be singular,
since the source becomes concentrated at these points, at the two ends of its
history. But $I^0$ is also a singular point, having infinite conformal curvature
(although a $C^0$ conformal metric can be assigned at $I^0$). Thus we omit
$I^-$, $I^0$, and $I^+$ from the definition of $\bar{\mathcal{M}}$ (and from $\mathscr{I} = \mathscr{I}^- \cup \mathscr{I}^+$).
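
The passage from (8.11) to (8.13), and the relation between $u$ and $v$ used for (8.14), can be checked numerically. The following is a minimal sketch under stated assumptions (Python, standard library only; function names are illustrative, and the logarithm restricts the check of $v$ to $r > 2m$): it multiplies the coefficients of (8.11) by $\Omega^2 = r^{-2}$, substitutes $dr = -dl/l^2$, and verifies by finite differences that $dv = du + 2\,dr/(1 - 2m/r)$.

```python
import math

def physical_coeffs(r, m):
    """Coefficients of du^2, dr du and the unit-sphere metric in (8.11)."""
    return 1.0 - 2.0 * m / r, 2.0, -r * r

def rescale(r, m):
    """Multiply (8.11) by Omega^2 = l^2 (l = 1/r) and replace dr by -dl / l^2.

    Returns the coefficients of du^2, dl du and the unit-sphere metric.
    """
    l = 1.0 / r
    a, b, c = physical_coeffs(r, m)
    return l * l * a, l * l * b * (-1.0 / (l * l)), l * l * c

def rescaled_coeffs(l, m):
    """Coefficients of du^2, dl du and the unit-sphere metric as written in (8.13)."""
    return l * l - 2.0 * m * l ** 3, -2.0, -1.0

def advanced_time(u, r, m):
    """v = u + 2r + 4m log(r - 2m); the logarithm requires r > 2m."""
    return u + 2.0 * r + 4.0 * m * math.log(r - 2.0 * m)

def dv_dr(r, m, h=1e-6):
    """Central finite-difference estimate of dv/dr at fixed u."""
    return (advanced_time(0.0, r + h, m) - advanced_time(0.0, r - h, m)) / (2.0 * h)

if __name__ == "__main__":
    m = 1.0
    for r in (3.0, 10.0, 1000.0):
        print(rescale(r, m), rescaled_coeffs(1.0 / r, m), dv_dr(r, m), 2.0 / (1.0 - 2.0 * m / r))
    # At l = 0 the coefficients are (0, -2, -1): the cross term keeps the metric nondegenerate.
    print(rescaled_coeffs(0.0, m))
```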




The form of the Schwarzschild metric given in (8.11), (8.14) is a special
case of

$ds^2 = r^{-2}A\,dr^2 - 2B_i\,dx^i\,dr + r^2 C_{ij}\,dx^i\,dx^j$    (8.15)

($i, j = 1, 2, 3$), where $A$, $B_i$, $C_{ij}$ are functions of $x^1, x^2, x^3$, and $r$, such that if
we put $x^0 = r^{-1}$, then these functions are to be sufficiently differentiable
(say $C^3$) as functions of $x^a$ ($a = 0, 1, 2, 3$) at and in the neighborhood of the
hypersurface $\mathscr{I}$ defined by $x^0 = 0$. If we assume that the relevant determinant
in $A$, $B_i$, $C_{ij}$ does not vanish, then clearly the metric $d\hat{s} = \Omega\,ds$, with $\Omega = r^{-1}$,
is regular ($C^3$) on $\mathscr{I}$. The metric (8.15) includes all metrics of the Bondi-Sachs
type, so a regularity assumption for $\mathscr{I}$ seems a not unreasonable one
to impose if we wish to study asymptotically flat space-times and allow the
possibility of gravitational radiation. In fact, with no field equations imposed
to restrict $R_{ab}$, (8.15) is more general than the Bondi-Sachs metrics. In
particular, (8.15) includes the de Sitter space-time and also asymptotically
de Sitter space-times. These are cases which can arise if we have a cosmological
term present in Einstein's vacuum equations.

The choice of $\Omega$ for (8.15) which is required to make the metric $d\hat{s}$ regular
at $\mathscr{I}$ has the important property that its gradient at $\mathscr{I}$ does not vanish
($\partial\Omega/\partial x^a = (1, 0, 0, 0)$ at $\mathscr{I}$) and thus defines the normal direction to $\mathscr{I}$ there.
It is convenient to supplement an assumption of regularity at $\mathscr{I}$ with the
condition that $\Omega$ has nonvanishing gradient at $\mathscr{I}$, but under certain circumstances
this condition can be deduced from the regularity assumption.

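Let us be more precise as to the conditions to be satisfied. We require
that our space-time $\mathcal{M}$, with metric $ds^2$, be extendible to a conformal manifold
with boundary $\bar{\mathcal{M}} \supset \mathcal{M}$ (int $\bar{\mathcal{M}} = \mathcal{M}$; $\partial\bar{\mathcal{M}} = \bar{\mathcal{M}} - \mathcal{M}$) such that


THERE EXISTS A SMOOTH (SAY $C^3$, AT LEAST) REAL-VALUED
FUNCTION $\Omega$ ($\ge 0$) ON $\bar{\mathcal{M}}$ AND A SMOOTH PSEUDO-RIEMANNIAN
METRIC $d\hat{s}^2$ ON $\bar{\mathcal{M}}$ (CONSISTENT WITH ITS CONFORMAL STRUCTURE)
SUCH THAT $d\hat{s}^2 = \Omega^2\,ds^2$ ON $\mathcal{M}$    (8.16)

ON $\partial\bar{\mathcal{M}}$, WE HAVE $\Omega = 0$, $\hat{\nabla}_a\Omega \ne 0$    (8.17)

EVERY NULL GEODESIC IN $\mathcal{M}$ HAS TWO END-POINTS ON $\partial\bar{\mathcal{M}}$    (8.18)


If such an $\bar{\mathcal{M}}$ exists, we call $\mathcal{M}$ asymptotically simple. Then $\bar{\mathcal{M}}$ is
unique (as follows, for example, by Geroch's construction of $\bar{\mathcal{M}}$ in terms of
null geodesics [34]). We write $\partial\bar{\mathcal{M}} = \mathscr{I}$ for the points at infinity for $\mathcal{M}$.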

The condition (8.18) is to ensure that the whole of infinity for $\mathcal{M}$ has
been included. It is generally difficult to verify in practice, however, and is
not even satisfied by some space-times that one might like to think of as being
asymptotically flat but which contain bound null orbits (that is, null geodesics
which do not escape to infinity, such as are found at $r = 3m$ in the Schwarzschild
solution). In order to cover this type of possibility, let us call a space-time
$\mathcal{M}$ weakly asymptotically simple if an asymptotically simple $\mathcal{M}_0$ exists,
such that for some open subset $\mathcal{K}$ of $\bar{\mathcal{M}}_0$, with $\partial\bar{\mathcal{M}}_0 \subset \mathcal{K}$, the region $\mathcal{M}_0 \cap \mathcal{K}$
(with metric induced from $\mathcal{M}_0$) is isometric with a subset of $\mathcal{M}$. That is to
say, a weakly asymptotically simple space-time possesses the conformal
infinity $\mathscr{I} = \partial\bar{\mathcal{M}}_0$ of an asymptotically simple space-time but it may possess
other "infinities" as well.

To obtain the connection with Bondi-Sachs asymptotic flatness, let us
suppose that $\mathcal{M}$ is weakly asymptotically simple with conformal infinity $\mathscr{I}$.
Assuming Einstein's equations (7.1), it can be shown that, provided $T_{ab}$ does
not approach some nonzero multiple of $g_{ab}$ near $\mathscr{I}$, then


$\mathscr{I}$ IS SPACELIKE, TIMELIKE, OR NULL, ACCORDING AS THE COSMOLOGICAL
CONSTANT $\lambda$ IS POSITIVE, NEGATIVE, OR ZERO.    (8.19)


[Also, if $\lambda \ne 0$, then we do not need to assume the condition $\hat{\nabla}_a\Omega \ne 0$ of
(8.17) but we can deduce it.] If $\mathscr{I}$ is spacelike or null, then $\mathscr{I}$ is the disjoint
union of two hypersurfaces $\mathscr{I}^-$, $\mathscr{I}^+$. The points of $\mathscr{I}^-$ are distinguished
from the points of $\mathscr{I}^+$ by the fact that their future null cones, rather
than their past null cones, lie in $\mathcal{M}$. We are interested here in asymptotic
flatness, rather than cosmology, so let us assume that $\mathscr{I}$ is null. Then it
follows²² (provided $\mathcal{M}_0$, in addition to being time- and space-orientable, is
free of closed timelike curves, possibly an unnecessary assumption) that

$\mathscr{I}^- \cong \mathscr{I}^+ \cong S^2 \times E^1$    (8.20)

the $E^1$'s being the null generators of $\mathscr{I}^\pm$. The result (8.20) shows that when
$\mathscr{I}$ is null, it fairly closely resembles the infinity of Minkowski space-time.

To proceed further, we need stronger conditions on the way that $T_{ab}$
behaves near $\mathscr{I}$. For simplicity of the discussion, let us assume, for the
moment, that $T_{ab} = 0$ (and $\lambda = 0$) in the neighborhood of $\mathscr{I}$. Then more
can be said. In fact,

$\hat{\nabla}_{A'(A}\hat{\nabla}_{B)B'}\Omega = 0$ AT $\mathscr{I}$    (8.21)

and as a consequence of (8.21) and (8.20), cf. [76],

$\Psi_{ABCD} = 0$ AT $\mathscr{I}$    (8.22)

[We can also obtain $\hat{\nabla}_a\Omega \ne 0$ at $\mathscr{I}$, rather than assume it in (8.17).] On the
basis of (8.22) it is possible to deduce the peeling-off property for principal
null directions that was mentioned in Section 7. The arguments involved
in this will be indicated next.

In spinor form, the Bianchi identities (8.2) become

$\nabla^{A}{}_{P'}\Psi_{ABCD} = \nabla_{B}{}^{Q'}\Phi_{CDQ'P'} - 2\varepsilon_{B(C}\nabla_{D)P'}\Lambda$    (8.23)

Thus, with $T_{ab} = 0$, we get

$\nabla^{AP'}\Psi_{ABCD} = 0$    (8.24)


22 An outline of an argument for obtaining this result is given in an appendix in [76]. 
It would seem that there must be a more satisfactory way of obtaining the result, however. 




Equation (8.24) is of interest, because it is the curved-space version of the
zero rest-mass free-field equation in the special case of spin $s = 2$. For
general nonzero spins $s = \tfrac{1}{2}, 1, \tfrac{3}{2}, 2, \ldots$ we would have

$\nabla^{AP'}\varphi_{AB\ldots L} = 0$    (8.25)

where

$\varphi_{AB\ldots L} = \varphi_{(AB\ldots L)}$    (8.26)

is a spinor with $2s$ indices. For $s = \tfrac{1}{2}$, Eq. (8.25) is the Weyl neutrino
equation (zero-mass Dirac equation). For $s = 1$, (8.25) is the spinor version
of Maxwell's free-space equations

$\nabla_{[a}F_{bc]} = 0, \qquad \nabla^{a}F_{ab} = 0$    (8.27)

with the Maxwell field tensor

$F_{ab} = -F_{ba} = \varphi_{AB}\,\varepsilon_{A'B'} + \varepsilon_{AB}\,\bar{\varphi}_{A'B'}$    (8.28)


In flat space-time, (8.25) is a very satisfactory equation, the solutions
having the same number of "degrees of freedom" (namely two) for any
value of $s$ as they do in the two cases $s = \tfrac{1}{2}, 1$. However, in curved space-time
we have the consistency condition [16, 84],

$(2 - 2s)\,\Psi^{ABM}{}_{(C}\,\varphi_{D\ldots L)ABM} = \nabla^{A}{}_{P'}\nabla^{BP'}\varphi_{ABCD\ldots L} = \nabla^{B}{}_{P'}\nabla^{AP'}\varphi_{ABCD\ldots L} = 0$    (8.29)

so that if $s \ge \tfrac{3}{2}$, (8.25) implies an interconnection between $\varphi_{AB\ldots L}$ and the
conformal curvature spinor $\Psi_{ABCD}$. In the case $\varphi_{ABCD} = \Psi_{ABCD}$, (8.29) is,
however, automatically satisfied because of symmetry considerations. We
leave aside the general question of higher-spin field equations here and
interpret (8.25) only in the cases $s = \tfrac{1}{2}$ (neutrinos), $s = 1$ (electromagnetism),
and $s = 2$ (gravitation).

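An important feature of (8.25) is that it is conformally invariant with

$\hat{\varphi}_{AB\ldots L} = \Omega^{-1}\varphi_{AB\ldots L}$    (8.30)

We require various conformal transformation formulas. From $d\hat{s} = \Omega\,ds$
we get

$\hat{g}_{ab} = \Omega^{2}g_{ab}, \qquad \hat{g}^{ab} = \Omega^{-2}g^{ab}$    (8.31)

so we can set²³

$\hat{\varepsilon}_{AB} = \Omega\,\varepsilon_{AB}, \quad \hat{\varepsilon}_{A'B'} = \Omega\,\varepsilon_{A'B'}, \quad \hat{\varepsilon}^{AB} = \Omega^{-1}\varepsilon^{AB}, \quad \hat{\varepsilon}^{A'B'} = \Omega^{-1}\varepsilon^{A'B'}$    (8.32)

For a scalar $\chi$, we have

$\hat{\nabla}_{a}\chi = \nabla_{a}\chi$    (8.33)

For indexed quantities we have, for example,

$\hat{\nabla}_{AA'}\xi_{B} = \nabla_{AA'}\xi_{B} - \Upsilon_{BA'}\xi_{A}, \qquad \hat{\nabla}_{AA'}\eta_{B'} = \nabla_{AA'}\eta_{B'} - \Upsilon_{AB'}\eta_{A'}$
$\hat{\nabla}_{AA'}\xi^{B} = \nabla_{AA'}\xi^{B} + \varepsilon_{A}{}^{B}\,\Upsilon_{CA'}\xi^{C}, \qquad \hat{\nabla}_{AA'}\eta^{B'} = \nabla_{AA'}\eta^{B'} + \varepsilon_{A'}{}^{B'}\,\Upsilon_{AC'}\eta^{C'}$    (8.34)

where

$\Upsilon_{AA'} = \Omega^{-1}\nabla_{AA'}\Omega$    (8.35)

To treat a quantity with more than one index we simply treat it as though it
were a product, according to the Leibniz rule. That is to say, we get one
term in $\Upsilon_{AA'}$ for each index. To verify the correctness of (8.34) we need only
show that the axioms (4.50)-(4.58) are satisfied.


23 A somewhat different convention has been used here for treating spinors under
conformal transformation than that used in [76], resulting in different transformation
formulas. This difference is forced upon us by the attitude adopted in Section 3. In fact,
the formulas obtained here are a little simpler than those of [76] and fractional powers
of $\Omega$ are avoided.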

From (8.34), we get that (8.30) substituted in (8.25) yields

$\hat{\nabla}^{AP'}\hat{\varphi}_{AB\ldots L} = 0$    (8.36)

which is the required conformal invariance. However, when applying this
to the gravitational field, we have to be careful. The Weyl tensor has the
conformal invariance property $\hat{C}^{a}{}_{bcd} = C^{a}{}_{bcd}$, that is, $\hat{C}_{abcd} = \Omega^{2}C_{abcd}$. Hence

$\hat{\Psi}_{ABCD} = \Psi_{ABCD}$    (8.37)

Thus, if according to the physical metric $ds$ we set

$\varphi_{ABCD} = \Psi_{ABCD}$    (8.38)

then we have [by (8.30) and (8.37)]

$\hat{\varphi}_{ABCD} = \Omega^{-1}\hat{\Psi}_{ABCD}$    (8.39)

It is the field $\hat{\varphi}_{ABCD}$ which we identify with the gravitational field when we
examine $\bar{\mathcal{M}}$ according to its metric $d\hat{s}$.

The significance of this lies in the behavior of the field on $\mathscr{I}$ and the
relation of this to the peeling-off property. Recall, from (8.22), that
$\hat{\Psi}_{ABCD}$ ($= \Psi_{ABCD}$) vanishes on $\mathscr{I}$. From (8.39) and smoothness we obtain:

$\hat{\varphi}_{ABCD}$ IS CONTINUOUS AT $\mathscr{I}$    (8.40)

Since it costs us no more at this point, let us derive the peeling-off property
for general spin $s$ on the basis of the assumption that $\hat{\varphi}_{AB\ldots L}$ is continuous
at $\mathscr{I}$.

We require a spin frame $o^A = \varepsilon_0{}^A$, $\iota^A = \varepsilon_1{}^A$. For the $d\hat{s}$ metric set

$\hat{o}_A = o_A, \qquad \hat{\iota}_A = \Omega\,\iota_A, \qquad \hat{o}^A = \Omega^{-1}o^A, \qquad \hat{\iota}^A = \iota^A$    (8.41)




This is consistent with (8.32) and (4.27). The reason for the asymmetrical
choice (8.41) is that we wish to arrange $o_A$ to have its direction pointing along
a null geodesic $\eta$, with $o_A$, $\iota_A$ parallelly propagated (according to $ds$) along $\eta$:

$\nabla_{00'}\,o_A = 0, \qquad \nabla_{00'}\,\iota_A = 0$    (8.42)

Then (8.34) and (8.41) yield

$\hat{\nabla}_{00'}\,\hat{o}_A = 0, \qquad \hat{\nabla}_{00'}\,\hat{\iota}_A = \Omega^{-1}\hat{o}_A\hat{\nabla}_{10'}\Omega$    (8.43)

Thus $\hat{o}_A$ is parallelly propagated according to $d\hat{s}$ along $\eta$ and attains a
well-defined value at $Q^+ = \eta \cap \mathscr{I}^+$. In fact $\hat{\iota}_A$ also has a well-defined limiting
value at $Q^+$ and points along $\mathscr{I}^+$ at $Q^+$ (so $\hat{\nabla}_{11'}\Omega = 0$ at $\mathscr{I}$; cf. [76]).
The affine parameter $r$ on $\eta$ (affine according to $ds$) defined by

$\nabla_{00'}\,r = 1$    (8.44)

has the asymptotic behavior

$r \sim \Omega^{-1}$    (8.45)


Now, by hypothesis, the components $\hat{\varphi}_{00\ldots 01\ldots 1}$, according to the
$\hat{o}^A$, $\hat{\iota}^A$ spin frame, are all continuous at $Q^+$. Substituting (8.30), (8.41), and
(8.45) we interpret this in terms of the physical quantities as

$\lim_{r \to \infty}\{r^{2s+1-i}\varphi_{(i)}\}$ EXISTS ($i = 0, 1, \ldots, 2s$)    (8.46)

where $\varphi_{(i)}$ stands for $\varphi_{00\ldots 01\ldots 1}$ with $i$ "ones" and $2s - i$ "zeros."
Equation (8.46) tells us that the $r^{-k}$ part of $\varphi_{AB\ldots L}$ along $\eta$ satisfies

$\varphi_{AB\ldots DE\ldots L}\,o^A o^B \cdots o^D = 0$    (8.47)

($k = 1, 2, \ldots, 2s$) where $k$ spinors $o^A$ appear in (8.47). Condition (8.47) is
just the condition for at least $2s - k + 1$ principal null directions of $\varphi_{AB\ldots L}$
to coincide in the direction of $o^A$. This, then, is the peeling-off property.
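
The counting in (8.46) and (8.47) can be tabulated with a short sketch (Python, standard library only; the function names are illustrative, not from the text). It records the leading fall-off $r^{-(2s+1-i)}$ of the component $\varphi_{(i)}$, which components can contribute to the $r^{-k}$ part, and the resulting minimum number $2s - k + 1$ of coincident principal null directions.

```python
from fractions import Fraction

def _two_s(s):
    """Number of spinor indices 2s for spin s (s may be an int, a Fraction, or 0.5, 1.5, ...)."""
    return int(2 * Fraction(s))

def falloff_exponent(s, i):
    """Leading power n with phi_(i) = O(r^-n), read off from (8.46): n = 2s + 1 - i."""
    two_s = _two_s(s)
    if not 0 <= i <= two_s:
        raise ValueError("i must satisfy 0 <= i <= 2s")
    return two_s + 1 - i

def contributing_components(s, k):
    """Indices i whose component phi_(i) can contribute to the r^-k part of the field."""
    return [i for i in range(_two_s(s) + 1) if falloff_exponent(s, i) <= k]

def coincident_principal_directions(s, k):
    """Minimum number of principal null directions along o^A for the r^-k part, by (8.47)."""
    two_s = _two_s(s)
    if not 1 <= k <= two_s:
        raise ValueError("k must satisfy 1 <= k <= 2s")
    return two_s - k + 1

if __name__ == "__main__":
    for k in range(1, 5):  # gravitation, s = 2
        print(k, contributing_components(2, k), coincident_principal_directions(2, k))
```

For $s = 2$ the $r^{-1}$ part involves only $\varphi_{(4)}$ and has all four principal null directions along $o^A$, while each further power of $r^{-1}$ lets one more direction "peel off."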

For the case of gravitation, this peeling-off property is a characteristic
feature of Bondi-Sachs asymptotic flatness. It is also possible to relate
more directly the precise type of coordinates used in the Bondi-Sachs
approach to the conformal structure that arises in the present formalism.
This has been done explicitly by Tamburino and Winicour [104]. In the
present approach we may thus define asymptotic flatness for $\mathcal{M}$ just to be
weak asymptotic simplicity, in the case when $\lambda = 0$ and $T_{ab} = 0$ near $\mathscr{I}$. If
$T_{ab}$ only approaches zero in some way toward infinity, we may need further
conditions, such as (8.21), (8.22) in addition to weak asymptotic simplicity.
In the case of the Einstein-Maxwell equations holding near $\mathscr{I}$, then (except
for a possibility of some pathological exceptional cases) such extra conditions
are not needed, and peeling-off can be deduced for both gravitation and
electromagnetism.




FIGURE 18. The Bondi-Sachs energy-momentum is an integral over
$\mathcal{S} = \mathcal{N} \cap \mathscr{I}^+$.


Let us return, briefly, to the question of energy-momentum with which
we started. If the outgoing null hypersurface $\mathcal{N}$ meets $\mathscr{I}^+$ in the 2-surface
$\mathcal{S}$ (Fig. 18), then we can perform an integral over $\mathcal{S}$ to obtain the total
energy-momentum intercepted by any hypersurface spanning $\mathcal{S}$. If we
choose $\Omega$ so that the metric of $\mathcal{S}$ is that of a unit sphere, then we integrate

$\hat{\sigma}N - \Psi_2$    (8.48)

over $\mathcal{S}$ to get the energy component (associated with this choice of $\Omega$) and we
integrate (8.48) with a suitable weighting factor to get the momentum
components. Here

$\Psi_2 = \hat{\varphi}_{0011}$    (8.49)

on $\mathcal{S}$, so $\Psi_2$ comes from the $r^{-3}$ part of the Weyl curvature in the peeling-off.
The quantity $\hat{\sigma}$ is the shear $\hat{\Gamma}_{0001'}$ of $\mathcal{N}$ at $\mathcal{S}$ (with $\hat{o}^A$ and $\hat{\iota}^A$ describing the
null direction in $\mathcal{N}$ and $\mathscr{I}$, respectively). The quantity $N$ is the Bondi-Sachs
"news function" given here by

$N = \hat{\Phi}_{110'0'}$    (8.50)

Although this is a Ricci tensor component, it retains a certain conformal
covariance by virtue of the above restriction on $\Omega$. The whole energy-momentum
turns out to have the correct transformation behavior for an
"asymptotic 4-vector."
The Bondi-Sachs mass-loss formula finds expression in the fact that if
we repeat the mass integral with a new null hypersurface lying to the future
of the original one, the mass value so obtained is always less than or equal to
the earlier mass value. The mass carried by the gravitational waves turns
out to be the integral of $N\bar{N}$ over the part of $\mathscr{I}^+$ lying between the two null
hypersurfaces. In fact $N$ may be described as an integral, along the generators
of $\mathscr{I}^+$, of $\Psi_4 = \hat{\varphi}_{1111}$. [This follows from (8.50), in fact.] In the peeling-off,
$\Psi_4$ is the $r^{-1}$ part of the gravitational field and so may be thought of as the
outgoing radiation field. "News function" is therefore essentially just the
"time-integral of the outgoing gravitational radiation field." It turns out
that the Bondi-Sachs mass-loss formula is closely related to the positive
focusing property (7.39) on a null line $\eta$ which lies just inside $\mathscr{I}^+$. Here
$N$ takes the place of $\sigma$ in (7.39) and $\Psi_4$ takes the place of $\Psi$. (See [78].)
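
The monotonic character of the mass loss can be illustrated with a toy bookkeeping sketch (Python, standard library only). Everything specific in it is an assumption made for illustration: the normalization constant `k`, the Gaussian burst used for the sphere-integrated $N\bar{N}$, and all function names are hypothetical and are not taken from the text. The only input from the text is that the mass decreases by the integral of the non-negative quantity $N\bar{N}$ over the relevant part of $\mathscr{I}^+$.

```python
import math

def mass_history(m0, flux, u_grid, k=1.0 / (4.0 * math.pi)):
    """Masses m(u_j) = m0 - k * integral of flux(u) du, by the trapezoid rule.

    flux(u) stands for the integral of N * conj(N) over the sphere at retarded time u;
    k is an assumed normalization constant, not a value fixed by the text.
    """
    masses = [m0]
    for a, b in zip(u_grid, u_grid[1:]):
        mean_flux = 0.5 * (flux(a) + flux(b))
        masses.append(masses[-1] - k * mean_flux * (b - a))
    return masses

def toy_flux(u, amplitude=0.3, width=1.0):
    """Hypothetical sphere-integrated |N|^2: a Gaussian burst of news in retarded time."""
    return 4.0 * math.pi * (amplitude * math.exp(-(u / width) ** 2)) ** 2

def uniform_grid(start, stop, n):
    """n + 1 equally spaced retarded-time values from start to stop."""
    step = (stop - start) / n
    return [start + j * step for j in range(n + 1)]

if __name__ == "__main__":
    grid = uniform_grid(-6.0, 6.0, 1200)
    masses = mass_history(1.0, toy_flux, grid)
    print("initial mass:", masses[0], "final mass:", masses[-1])
```

Because the integrand is a squared modulus, each step can only lower the running mass, whatever profile is chosen for the news.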

It may be remarked that the mass loss occurs irrespective of the presence
of incoming radiation (that is, the analog of $\Psi_4$ on $\mathscr{I}^-$). Such asymptotic
flatness assumptions at $\mathscr{I}^+$ do not exclude incoming waves (as seems to have
been believed at one time), although they forbid a too large concentration of
incoming waves just inside $\mathscr{I}^+$. To see the mass gain due to incoming
radiation, we should have to turn the above construction upside-down and
perform our calculations on $\mathscr{I}^-$ instead.

As a final remark concerning energy in general relativity, we see that
while the (future-measured) mass always decreases, there is no guarantee
that the mass measurement never becomes negative! In the cases in which
all sources finally are radiated away to $\mathscr{I}^+$ as zero-rest-mass fields, we can
say from the Bondi-Sachs formula that the total mass was always positive.²⁴
But to my knowledge, it is an unsolved problem whether the total mass is
positive in general situations where we just assume a suitable physically
reasonable inequality on $T_{ab}$.


9 HORIZONS 


When examining specific cosmological models, it is sometimes useful
to represent infinity conformally, even though the space-time may not be
(weakly) asymptotically simple. For example, in the usual Robertson-Walker
"big bang" models, we can represent the initial infinite curvature
singularity as a nonsingular hypersurface boundary in the past, but only if
we allow $\Omega = \infty$ at this region. The nature of such boundary hypersurfaces
is closely related to the question of four types of horizon which may occur in
cosmology: the event horizon or particle horizon [88] and future or past
Cauchy horizons [40, 42]. These will be defined precisely in a moment.

I shall first require some definitions. By a curve I mean the image of a
(noncritical) $C^1$-map $f$ of a real interval of nonzero length, where I shall
demand that every point of $\mathcal{M}$ has a neighborhood $\mathcal{Q}$ for which $f^{-1}\mathcal{Q}$ has no
noncompact connected component. Thus end-points are required (at which
the curve would have to be smooth, and therefore extendible) except where
the curve continues indefinitely in one direction or the other. (Otherwise we
could refer to an open-ended curve.) If the map $f$ itself is required, this is a
parametrized curve. A curve is timelike [resp. nonspacelike] if the tangent
vectors of the map $f$ are all timelike [resp. timelike or null]. A nonspacelike
curve has a natural orientation, called a future orientation, which is induced
by the time-orientation of $\mathcal{M}$.


24 Brill [14] obtained some partial results concerning positiveness of total mass of a
system as measured on a spacelike hypersurface. A promising new approach to this
problem has recently been given by Brill and Deser [14a].

If $P$ and $Q$ are two points of $\mathcal{M}$, then (see [50]) the notation $P \ll Q$ is
used if there is a timelike curve in $\mathcal{M}$ with past end-point $P$ and future
end-point $Q$. The notation $P \prec Q$ is used if either $P = Q$ or there is a
nonspacelike curve from $P$ to $Q$. If $P \prec Q$ and $Q \ll R$, or if $P \ll Q$ and $Q \prec R$,
then $P \ll R$. If $P \prec Q$ (with $P \ne Q$), then either $P \ll Q$ or there is a null
geodesic from $P$ to $Q$ (or both). The chronological future and past of $P$,
respectively, are

$I_+(P) = \{X : P \ll X\}, \qquad I_-(P) = \{X : X \ll P\}$    (9.1)

These are open sets, as is easily seen. On the other hand, the sets

$J_+(P) = \{X : P \prec X\}, \qquad J_-(P) = \{X : X \prec P\}$    (9.2)

are not necessarily closed. For example, let $\mathcal{M}$ be Minkowski space with a
point removed and let $P$ be on the null cone of the removed point. For
another example, consider a plane wave, cf. (9.23)-(9.31). If $\mathcal{K}$ is any subset
of $\mathcal{M}$, the chronological future and past of $\mathcal{K}$ are denoted, respectively, by

$I_+[\mathcal{K}] = \bigcup_{X \in \mathcal{K}} I_+(X), \qquad I_-[\mathcal{K}] = \bigcup_{X \in \mathcal{K}} I_-(X)$    (9.3)

Again these are open sets. The closures of these sets are

$\bar{I}_+[\mathcal{K}] = \{X : I_+(X) \subset I_+[\mathcal{K}]\}, \qquad \bar{I}_-[\mathcal{K}] = \{X : I_-(X) \subset I_-[\mathcal{K}]\}$    (9.4)

The proof of (9.4) is easy and will be omitted here. From (9.4), we obtain²⁵
the boundaries of $I_\pm[\mathcal{K}]$:

$\dot{I}_+[\mathcal{K}] = \{X : I_+(X) \subset I_+[\mathcal{K}],\ X \notin I_+[\mathcal{K}]\}$
$\dot{I}_-[\mathcal{K}] = \{X : I_-(X) \subset I_-[\mathcal{K}],\ X \notin I_-[\mathcal{K}]\}$    (9.5)

A set of the form $\dot{I}_+[\mathcal{K}]$ will be called a semispacelike boundary or
SSB. In fact, this is time-symmetrical since

$\dot{I}_+[\mathcal{K}] = \dot{I}_-[\mathcal{L}]$    (9.6)

where $\mathcal{L}$ is the complement in $\mathcal{M}$ of $I_+[\mathcal{K}]$. Conversely, (9.6) still holds if
it is $\mathcal{L}$ that is an arbitrary subset of $\mathcal{M}$ and $\mathcal{K}$ is, for example, the complement
of $I_-[\mathcal{L}]$. A semispacelike set, in general, is a subset of $\mathcal{M}$ no two points
$X$, $Y$ of which satisfy $X \ll Y$. Clearly an SSB is semispacelike. Although
SSB's need not be smooth subsets of $\mathcal{M}$, they have certain nice properties.
These will play an important part in the "singularity theorems" of Sections
10 and 11. For the moment, we simply note that:


25 The notation $\mathcal{A} \subset \mathcal{B}$ means just that $\mathcal{A}$ is a subset of $\mathcal{B}$, not necessarily a proper
subset (so $\mathcal{A} \subset \mathcal{A}$).


ANY SSB IS A $C^0$ THREE-DIMENSIONAL SUBMANIFOLD OF $\mathcal{M}$
(WITHOUT BOUNDARY; AND IS A CLOSED SET)    (9.7)


To see that an SSB is locally homeomorphic to $E^3$, we examine a small
normal coordinate neighborhood ($x^a$) of any point $P$ of the SSB (with $\nabla_a x^0$
timelike and $\nabla_a x^i$ spacelike, $i \ne 0$). Any curve $x^i$ = const. ($i \ne 0$) which is near
enough to $P$ intersects both $I_+(P)$ and $I_-(P)$. It therefore intersects the SSB
at a point which must be unique. This gives a local homeomorphism onto
an open set of the $R^3$ of $(x^1, x^2, x^3)$ (Fig. 19).

FIGURE 19. Any SSB is locally homeomorphic to $E^3$.


We are now in a position to define our four types of horizon. Let $\gamma$ be
any timelike curve in $\mathcal{M}$. Then (when nonvacuous)

EVENT HORIZON OF $\gamma$ = $\dot{I}_-[\gamma]$
PARTICLE HORIZON OF $\gamma$ = $\dot{I}_+[\gamma]$    (9.7a)

If $\gamma$ has a future end-point $F$, then its event horizon is simply $\dot{I}_-(F)$; if $\gamma$ has a
past end-point $P$, then its particle horizon is $\dot{I}_+(P)$. These cases will be
called trivial. An example of a space-time with no nontrivial event or
particle horizons is the Einstein static universe mentioned in Section 8.
However, Minkowski space-time does possess nontrivial event and particle
horizons, given, for example, when $\gamma$ is a path of uniform acceleration.
[See (2.8)-(2.10) and Fig. 6. If $\gamma$ is $X = Y = 0$, $Z$ = const. $> 0$, then $z = t$
is an event horizon of $\gamma$ and $z = -t$ is a particle horizon of $\gamma$.] On the other
hand, no timelike geodesic in Minkowski space-time has an event or particle
horizon. Physically, the event horizon of $\gamma$ separates those events which are
observable by an observer whose world line is $\gamma$ from those events unobservable
by him. Similarly, the particle horizon of $\gamma$ separates those events
from which a particle with world line $\gamma$ can be observed (in principle), from
those events from which the particle cannot be so observed.

FIGURE 20. If $\mathscr{I}^+$ is spacelike, every timelike curve possesses an event horizon.
(Labels in the figure: observer $\gamma$, observable events, unobservable events.)

Examples of space-times with nontrivial event or particle horizons for
geodesic curves $\gamma$ are obtained if $\mathcal{M}$ possesses a spacelike conformal infinity
$\mathscr{I}^+$ or $\mathscr{I}^-$. (We allow $\Omega = 0$ or $\Omega = \infty$ at $\mathscr{I}$.) If $\mathscr{I}^+$ is spacelike, then
any timelike geodesic $\gamma$ (indeed, any inextendible timelike curve in $\mathcal{M}$)
through a point of $\mathcal{M}$ near enough to $\mathscr{I}^+$ will hit $\mathscr{I}^+$ at some point $Q$ in $\bar{\mathcal{M}}$.
We can define $\dot{I}_-(Q)$ (from the conformal structure of $\bar{\mathcal{M}}$) and this will be
the required event horizon $\dot{I}_-[\gamma]$ (Fig. 20). Exactly similarly, if $\mathscr{I}^-$ is
spacelike, we can find timelike geodesics $\gamma$ with nontrivial particle horizons
(Fig. 21).

FIGURE 21. If $\mathscr{I}^-$ is spacelike, every timelike curve possesses a particle horizon.
(Labels in the figure: particle $\gamma$, $\gamma$ visible, $\gamma$ invisible.)


To define the Cauchy horizons, let $\mathcal{S}$ be any closed semispacelike set.
Define the domains of dependence $D_+(\mathcal{S})$, $D_-(\mathcal{S})$ as follows:


$D_+(\mathcal{S})$ IS THE COMPLEMENT OF THE UNION OF ALL TIMELIKE
CURVES WHICH ARE INEXTENDIBLE IN THE PAST AND WHICH DO
NOT INTERSECT $\mathcal{S}$    (9.8)


with $D_-(\mathcal{S})$ as just the time-reverse of (9.8). Another way of putting this is


$X \in D_+(\mathcal{S})$ IF AND ONLY IF EVERY TIMELIKE CURVE THROUGH $X$
CAN BE EXTENDED INTO THE PAST UNTIL IT MEETS $\mathcal{S}$    (9.9)


and similarly for $D_-(\mathcal{S})$. [We have $\mathcal{S} \subset D_+(\mathcal{S})$.] It is readily seen that
the past-inextendible curves not meeting $\mathcal{S}$ sweep out an open subset of $\mathcal{M}$.
Thus, by (9.8)


$D_+(\mathcal{S})$ AND $D_-(\mathcal{S})$ ARE CLOSED SETS    (9.10)


The Cauchy horizons (Fig. 22) are the "future boundary" of $D_+(\mathcal{S})$ and the
"past boundary" of $D_-(\mathcal{S})$:

$H_+(\mathcal{S}) = \{X : X \in D_+(\mathcal{S}),\ I_+(X) \cap D_+(\mathcal{S}) = \emptyset\}$
$H_-(\mathcal{S}) = \{X : X \in D_-(\mathcal{S}),\ I_-(X) \cap D_-(\mathcal{S}) = \emptyset\}$    (9.11)

We have, easily,

$H_+(\mathcal{S})$ AND $H_-(\mathcal{S})$ ARE SEMISPACELIKE CLOSED SETS    (9.12)


FIGURE 22. Domains of dependence and Cauchy horizons for a semispacelike
set $\mathcal{S}$ (in a "trivial" case).
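
A "trivial" case of the kind shown in Fig. 22 can be worked out explicitly. The sketch below (Python, standard library only) makes a specific assumption not taken from the text: the space-time is 1+1-dimensional Minkowski space and $\mathcal{S}$ is the closed segment $t = 0$, $a \le x \le b$. Past-directed timelike curves through $(t, x)$ then reach $t = 0$ strictly inside $(x - t, x + t)$, so by (9.9) $D_+(\mathcal{S})$ is the closed triangle $t \ge 0$, $a + t \le x \le b - t$. By (9.11), $H_+(\mathcal{S})$ consists of its two slanted (null) edges, because any future timelike displacement from such an edge leaves the triangle. Function names are illustrative.

```python
def in_future_domain(t, x, a, b):
    """(t, x) lies in D_+(S) for S = {t = 0, a <= x <= b} in 1+1 Minkowski space (c = 1)."""
    return t >= 0.0 and a + t <= x <= b - t

def on_future_horizon(t, x, a, b, tol=1e-12):
    """(t, x) lies on H_+(S): in D_+(S) and on one of the null edges x = a + t or x = b - t."""
    if not in_future_domain(t, x, a, b):
        return False
    return abs(x - (a + t)) <= tol or abs(x - (b - t)) <= tol

if __name__ == "__main__":
    a, b = 0.0, 2.0
    print(in_future_domain(0.5, 1.0, a, b))    # interior point of the triangle
    print(in_future_domain(1.5, 1.0, a, b))    # above the apex: not determined by S
    print(on_future_horizon(1.0, 1.0, a, b))   # the apex
    print(on_future_horizon(0.0, 2.0, a, b))   # an end-point of S, which belongs to H_+(S)
    print(on_future_horizon(0.0, 1.0, a, b))   # interior point of S: not on the horizon
```

The end-points of the segment lie on the horizon, consistent with the relation between the edge of $\mathcal{S}$ and its Cauchy horizons stated in (9.13).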




We can define the edge of a closed semispacelike set $\mathcal{S}$ as the set of
points $P \in \mathcal{S}$ such that for any open set $\mathcal{Q}$ containing $P$, if $R \ll P \ll Q$ with
$R, Q \in \mathcal{Q}$ then there is a timelike curve in $\mathcal{Q}$ from $R$ to $Q$ not meeting $\mathcal{S}$.
We have

edge($\mathcal{S}$) $\subset H_+(\mathcal{S}) \cap H_-(\mathcal{S})$    (9.13)

The set edge($\mathcal{S}$) is always closed. It consists of those points of $\mathcal{S}$ at which
$\mathcal{S}$ is not locally homeomorphic to $E^3$. If edge($\mathcal{S}$) $= \emptyset$, we can call $\mathcal{S}$
edgeless. If $\mathcal{S}$ is a smooth semispacelike set which is a hypersurface (without
boundary) in $\mathcal{M}$ (so the tangent hyperplanes are everywhere spacelike or
null), then $\mathcal{S}$ is edgeless if and only if $\mathcal{S}$ is a closed subset of $\mathcal{M}$ (that is,
$\mathcal{S}$ is properly embedded). As another example, any SSB is edgeless.

A Cauchy horizon (nonvacuous) will be called nontrivial if it is the
$H_+(\mathcal{S})$ or $H_-(\mathcal{S})$ of some edgeless closed semispacelike set $\mathcal{S}$. Trivial
Cauchy horizons are easily constructed (in space-times with no closed timelike
curves) by just taking $\mathcal{S}$ to be a small closed portion of a spacelike
hypersurface. The Einstein static universe contains no nontrivial Cauchy
horizons. On the other hand, Minkowski space-time does contain nontrivial
Cauchy horizons {for example, if $\mathcal{S}$ is $x^0 = x^1$, with usual Minkowskian
coordinates, then $H_+(\mathcal{S}) = \mathcal{S}$; if $\mathcal{S}$ is $x^0 = -[1 + (x^1)^2 + (x^2)^2 + (x^3)^2]^{1/2}$,
then $H_-(\mathcal{S}) = \emptyset$, but $H_+(\mathcal{S})$ is $x^0 = -[(x^1)^2 + (x^2)^2 + (x^3)^2]^{1/2}$}.

A global Cauchy hypersurface or GCH for $\mathscr{M}$ is a semispacelike set $\mathscr{S}$
(normally a smooth hypersurface) for which $H_+(\mathscr{S})$ and $H_-(\mathscr{S})$ are both
empty. That is to say, $\mathscr{S}$ meets every inextendible timelike curve in $\mathscr{M}$.
An equivalent condition on a semispacelike set $\mathscr{S}$ for it to be a GCH is that
it should intersect every inextendible null geodesic in $\mathscr{M}$ in a nonvacuous
compact set. This will follow from the later considerations concerning the
structure of Cauchy horizons. It is not hard to see that the intersection of
a null geodesic with an SSB must be connected. One property of a space-time
$\mathscr{M}$, containing a GCH $\mathscr{S}$, which is easy to establish is that it is topologically
$\mathscr{S} \times E^1$, where the $E^1$'s can be represented as timelike curves. It is less
easy to establish, but still true,²⁶ that all the $\mathscr{S}$'s of the product can also be
made to be GCH's for $\mathscr{M}$. It also turns out that the existence of a GCH is
equivalent to Leray's "global hyperbolicity" condition.²⁷

It is clear that the usual "constant time" sections of the Einstein static
universe and of Minkowski space-time are GCH's. Among the examples of
space-times which do not contain any GCH are all those which possess a
timelike conformal infinity $\mathscr{I}$ (Fig. 23). Any point P near $\mathscr{I}$ lies on timelike
curves to the future and past which reach $\mathscr{I}$ in a region near to P (from the
point of view of M). Thus, any candidate $\mathscr{S}$ for a GCH would have to


26 One proof of this is given by Seifert [98]. 
27 This will be shown elsewhere. 




FIGURE 23. If $\mathscr{I}$ is timelike, every semispacelike set has a Cauchy horizon.


come near to P. Since this would apply to all points near $\mathscr{I}$, some of which
have a definite timelike separation from each other, we see that this cannot
be achieved with a semispacelike $\mathscr{S}$.

The de Sitter and anti-de Sitter space-times provide explicit examples
where $\mathscr{I}$ is spacelike and timelike, respectively. To describe these models,
we consider hyperspheres in five-dimensional pseudo-Euclidean space with
two choices of signature. De Sitter space-time (Fig. 24) is given by

$C^2 = -V^2 + W^2 + X^2 + Y^2 + Z^2$ (9.14)

where the metric is

$ds^2 = dV^2 - dW^2 - dX^2 - dY^2 - dZ^2$ (9.15)

Anti-de Sitter space-time (Fig. 25) is the universal covering manifold of

$C^2 = V^2 + W^2 - X^2 - Y^2 - Z^2$ (9.16)

with metric

$ds^2 = dV^2 + dW^2 - dX^2 - dY^2 - dZ^2$ (9.17)


De Sitter space-time is topologically $S^3 \times E^1$ and possesses $S^3$ sections
which are GCH's. But every timelike curve has both a particle horizon




FIGURE 24. The de Sitter model. 


and an event horizon. We can represent the space-time conformally on the
Einstein static universe as the region between two parallel spacelike sections
(Fig. 26). This evidently gives both $\mathscr{I}^+$ and $\mathscr{I}^-$ as spacelike. The steady-state
model is just the "top half" of the de Sitter model (see [97], for example),
where we cut it at the intersection with, say, the hyperplane V = W in (9.14)
(Fig. 27). This cut leaves a null boundary in the past, so some timelike
curves (for example, the substratum particles) do not possess particle horizons.


FIGURE 25. The anti-de Sitter model. [Labels recoverable from the figure: $S^1$; (Lobachevski).]




FIGURE 26. The de Sitter model as a conformal portion of the Einstein universe.
The Friedmann model with $\lambda = 0$, $k > 0$ is similarly represented. [Labels recoverable from the figure: $S^3$; $E^1$.]


The sections of (9.14) by V − W = const. > 0 give the (flat) hypersurfaces
of constant time of the steady-state model. Unlike the entire de Sitter and
anti-de Sitter space-times, this model is not geodesically complete.

The hypersphere (9.16) has topology $S^1 \times E^3$ and is thus not simply
connected. It possesses closed timelike curves ($S^1$) which can be opened
out by taking the universal covering manifold. Then, when represented




FIGURE 27. The steady-state model. 




conformally on the Einstein static universe $S^3 \times E^1$ the "unwrapped"
anti-de Sitter model becomes just a hemisphere of the $S^3$ times the $E^1$
(Fig. 28). The topology is thus just $E^4$, but its conformal structure is globally
different from that of Minkowski space-time. We have a timelike $\mathscr{I}$, so
every semispacelike $\mathscr{S}$ has a Cauchy horizon. On the other hand, no
inextendible timelike geodesics possess particle or event horizons (although
some inextendible timelike curves do). The existence of Cauchy horizons is
connected with an interesting feature of the anti-de Sitter model. Let P be
any point on the hypersphere (9.16) and let Q be diametrically opposite to P.
Any plane through PQ cuts the hypersphere in a "great circle" which
therefore must be a geodesic. One can see without great difficulty that, in
fact, every timelike geodesic on the hypersphere, through P, must also pass
through Q, but spacelike or null geodesics through P go to infinity and do not
reach Q. We may pass to the "unwrapped" anti-de Sitter space (the
universal covering space). For simplicity, choose versions of P and Q as
near as possible, with P < Q (Fig. 29). Choose R to be a point separated
from Q by a spacelike geodesic. Then it is easy to convince oneself that there
is no geodesic connecting P to R [17]. (We can get from P to R in two steps,


FIGURE 28. The anti-de Sitter model as a conformal portion of the Einstein 
universe. 




FIGURE 29. The points P and R cannot be joined by a geodesic.


however, using a broken timelike geodesic.) The anti-de Sitter model is,
in fact, geodesically complete, so the situation is here very different from the
case of positive-definite metrics. The nonexistence of a geodesic from P to R
is closely related to the fact that R lies beyond $H_+(I_+(P))$, as we shall see in
Section 11.
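
The focusing of timelike geodesics from P onto the diametrically opposite point Q can be checked numerically. The following is only a sketch under stated assumptions: the hypersphere is taken as $V^2 + W^2 - X^2 - Y^2 - Z^2 = C^2$ with the ambient metric (9.17), and the "great circle" through P with unit timelike tangent T is written as $P\cos(\tau/C) + C\,T\sin(\tau/C)$. The function names and the sample values of C and of the boost parameter are illustrative choices, not taken from the text.

```python
import math

def ads_dot(a, b):
    """Ambient inner product for metric (9.17), signature (+, +, -, -, -)."""
    return a[0]*b[0] + a[1]*b[1] - a[2]*b[2] - a[3]*b[3] - a[4]*b[4]

def great_circle(P, T, C, tau):
    """Point at proper time tau on the plane section P cos(tau/C) + C T sin(tau/C)."""
    c, s = math.cos(tau / C), math.sin(tau / C)
    return [c * p + C * s * t for p, t in zip(P, T)]

def endpoint_after_half_turn(chi, C=2.0):
    """Follow the timelike great circle from P for proper time pi*C.

    chi is a hypothetical boost parameter selecting the initial direction:
    T = (0, cosh chi, sinh chi, 0, 0) is unit timelike and orthogonal to P.
    Returns (P, endpoint).
    """
    P = [C, 0.0, 0.0, 0.0, 0.0]
    T = [0.0, math.cosh(chi), math.sinh(chi), 0.0, 0.0]
    return P, great_circle(P, T, C, math.pi * C)

if __name__ == "__main__":
    for chi in (0.0, 0.7, 2.0):
        P, Q = endpoint_after_half_turn(chi)
        print(chi, Q)  # each endpoint is numerically close to -P
```

Whatever initial timelike direction is chosen, the curve stays on the hypersphere and arrives at −P after proper time πC, which is the refocusing described above for the hypersphere before unwrapping.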

The models most often studied in cosmological contexts are the
Friedmann models, with metric


$ds^2 = dt^2 - R^2\, d\Sigma^2$ (9.18)


Here $d\Sigma^2$ stands for the metric of a unit 3-sphere (k = 1), or of a Euclidean
3-space (k = 0), or a unit hyperbolic (Lobachevski) 3-space (k = −1). The
quantity R = R(t) satisfies


$\tfrac{4}{3}\pi\rho R^3 = M = \text{const.} > 0$ (9.19)

$\dot{R}^2 = \dfrac{2M}{R} + \tfrac{1}{3}\lambda R^2 - k$ (9.20)


which gives an energy-momentum tensor appropriate to "dust" (that is, a
perfect fluid without pressure):


$T_{ab} = \rho(t)\, t_a t_b$ (9.21)

where

$t_a = \nabla_a t$ (9.22)




FIGURE 30. The different time-behaviors for the Friedmann models with $\lambda = 0$.


With $\lambda = 0$, the solutions of (9.20) are as indicated in Fig. 30. (The
time-reverses of these are also solutions, but not consistent, if $k \le 0$, with the
observation that the universe is expanding.) Note that only in the case of a
universe with positively curved (and therefore compact) space sections does
the model collapse back to a singularity (R = 0). (By means of identifications,
the cases $k \le 0$ can also be given compact space sections, if desired,
at the expense of a loss of global isotropy.)
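
The three time-behaviors can be checked against the standard parametric dust solutions. This is a minimal sketch, assuming that (9.20) with zero cosmological constant reads $\dot{R}^2 = 2M/R - k$; the parametrizations by an auxiliary variable eta and the function names are illustrative and are not given in the text.

```python
import math

def friedmann_point(k, eta, M=1.0):
    """Return (t, R, dR/dt) on the assumed parametric dust solution.

    k = +1: R = M(1 - cos eta),  t = M(eta - sin eta)   (cycloid)
    k =  0: R = M eta^2 / 2,     t = M eta^3 / 6
    k = -1: R = M(cosh eta - 1), t = M(sinh eta - eta)
    """
    if k == 1:
        R, t = M * (1 - math.cos(eta)), M * (eta - math.sin(eta))
        dR, dt = M * math.sin(eta), M * (1 - math.cos(eta))
    elif k == 0:
        R, t = M * eta**2 / 2, M * eta**3 / 6
        dR, dt = M * eta, M * eta**2 / 2
    elif k == -1:
        R, t = M * (math.cosh(eta) - 1), M * (math.sinh(eta) - eta)
        dR, dt = M * math.sinh(eta), M * (math.cosh(eta) - 1)
    else:
        raise ValueError("k must be +1, 0 or -1")
    return t, R, dR / dt

def residual(k, eta, M=1.0):
    """Rdot^2 - (2M/R - k); zero when the assumed form of (9.20) holds."""
    _, R, Rdot = friedmann_point(k, eta, M)
    return Rdot**2 - (2 * M / R - k)

if __name__ == "__main__":
    for k in (1, 0, -1):
        print(k, [residual(k, eta) for eta in (0.5, 1.0, 2.0)])
```

Only the k = +1 branch returns to R = 0 (at eta = 2π); for k = 0 and k = −1 the function R increases without bound, in line with the remark that only the positively curved model recollapses.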

We can represent the Friedmann models also as conformal subsets of
the Einstein static universe. Let us take $\lambda = 0$. The case k > 0 resembles


FIGURE 31. The Friedmann models with $k \le 0$, $\lambda = 0$ as conformal portions
of the Einstein universe.




the situation for de Sitter space-time (Fig. 26), the difference being that
$\Omega = \infty$ at $\mathscr{I}^\pm$ instead of $\Omega = 0$. The cases $k \le 0$ are indicated in Fig. 31.
In both cases $\mathscr{I}^-$ is spacelike with $\Omega = \infty$ and $\mathscr{I}^+$ is null with $\Omega = 0$.
When k = 0 we get $\nabla_a\Omega = 0$ at $\mathscr{I}^+$, and we have a point $I^0$. But $\nabla_a\Omega \ne 0$
when k < 0 and $I^0$ is an $S^2$. All the Friedmann models possess GCH's,
but they yield particle horizons for all timelike curves.

Let us finally consider two examples which do not possess GCH's, but
which fail to do so in rather different ways. First, we consider the
plane-fronted wave, which has metric [15, 28, 77]


$ds^2 = 2\{du + H(v, x, y)\, dv\}\, dv - dx^2 - dy^2$ (9.23)


the ranges of the real variables u, v, x, y being unrestricted. The nonvanishing
curvature tensor components for (9.23) are defined by

$\dfrac{\partial^2 H}{\partial x^2}, \quad \dfrac{\partial^2 H}{\partial x\, \partial y}, \quad \dfrac{\partial^2 H}{\partial y^2}$ (9.24)


The metric satisfies Einstein's vacuum equations with $\lambda = 0$ (and so represents
a purely gravitational wave) if


$\left(\dfrac{\partial^2}{\partial x^2} + \dfrac{\partial^2}{\partial y^2}\right) H = 0$ (9.25)


but if (9.25) is not imposed then (9.23) covers the more general situation of a
combined gravitational-electromagnetic-neutrino wave. All seven principal
null directions then coincide. If H is quadratic in x, y, then we get the special
case of a plane wave. Many plane waves are geodesically complete [28, p. 96].
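
As an illustration of condition (9.25) for quadratic profiles, the sketch below evaluates the two-dimensional Laplacian in x, y by central differences. The helper name, the sample point, and the step size are assumptions made for this example only.

```python
def laplacian_xy(h, x, y, eps=1e-3):
    """Five-point central-difference estimate of (d^2/dx^2 + d^2/dy^2) h.

    For polynomials of degree two in x, y the estimate is exact up to rounding.
    """
    return (h(x + eps, y) + h(x - eps, y) + h(x, y + eps) + h(x, y - eps)
            - 4.0 * h(x, y)) / eps**2

a = 0.5  # illustrative constant amplitude

def h_saddle(x, y):
    """Quadratic profile a(x^2 - y^2): satisfies (9.25)."""
    return a * (x * x - y * y)

def h_bowl(x, y):
    """Quadratic profile a(x^2 + y^2): Laplacian 4a, so (9.25) fails."""
    return a * (x * x + y * y)

if __name__ == "__main__":
    print(laplacian_xy(h_saddle, 0.3, -1.2))  # close to 0
    print(laplacian_xy(h_bowl, 0.3, -1.2))    # close to 4a = 2
```

Both profiles are quadratic in x, y and so give plane waves, but only the saddle-shaped one meets the vacuum condition; the rotationally symmetric one corresponds to the case where (9.25) is not imposed.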

If H = 0 over some range, we have a region of Minkowski space-time.
We may, in fact, choose H to be zero over all but some finite range of v:
$v_0 < v < v_1$. Then we have a "sandwich wave" [10]. An extreme case is
the idealized situation in which the wave is allowed to become infinitesimal
in duration (say $v_0, v_1 \to 0$) while still producing a nonzero resultant effect.
This results in the function H becoming a delta function in v:


$H(v, x, y) = \delta(v)\, h(x, y)$ (9.26)


With the substitution of (9.26) in (9.23), we get a metric which does not
satisfy the conditions normally required for a space-time with a delta function
in the curvature, as required in (9.24), since here the metric tensor components
involve delta functions. It is possible to make a coordinate change so that
the new metric tensor components are $C^0$ functions of the coordinates. But
the present form is useful since it leads to a graphic "scissors and paste"
construction of the manifold [80].
Divide Minkowski space-time, with metric


$ds^2 = 2\, du\, dv - dx^2 - dy^2$ (9.27)




into $\mathscr{M}^-$ (v < 0) and $\mathscr{M}^+$ (v > 0) by removing the null hyperplane v = 0.
Reattach a boundary v = 0 to each half but identify the two halves so that
each appears "warped" as viewed from the other (Fig. 32). This is achieved,
on passing from $\mathscr{M}^-$ to $\mathscr{M}^+$, by


$x \to x, \quad y \to y, \quad u \to u + h(x, y)$ (9.28)


at the common boundary v = 0. The jump in coordinates given in (9.28)
corresponds precisely to (9.26) substituted in (9.23).


FIGURE 32. A plane wave with a delta-function amplitude. The two half- 
spaces are flat, but joined with a warp. 


For simplicity, let us consider the purely electromagnetic plane wave
case. Then we can take


$h(x, y) = a(x^2 + y^2)$ (9.29)

where a = const. > 0. The null cone of the point $P \in \mathscr{M}^-$, with coordinates
$u = x = y = 0$, $v = -a^{-1}$ has the equation

$2u(v + a^{-1}) = x^2 + y^2$ (9.30)

At v = 0, this agrees with

$2u(v - a^{-1}) = x^2 + y^2$ (9.31)


by virtue of (9.28) and (9.29). Equation (9.31) gives the null cone of the
point $Q \in \mathscr{M}^+$ with coordinates $u = x = y = 0$, $v = a^{-1}$. This shows that
the null cone of P is (in this example) exactly focused again, after passing
through the wave, to the point Q. However, there is one exceptional null
geodesic $\alpha$, through P, namely $x = y = 0$, $v = -a^{-1}$, which never hits the
wave at all. Correspondingly, the null geodesic $\beta$ through Q, given by
$x = y = 0$, $v = a^{-1}$, does not emerge on the other side. We have the situation
that the two inextendible null geodesics $\alpha$, $\beta$ arise as the limiting configuration
of a sequence of single null geodesics through both P and Q. (The space of
null geodesics, in this example, is thus not Hausdorff.)




A spacelike GCH would have to intersect each null geodesic of the
sequence just once. Thus it would have to meet the limit pair $\alpha$, $\beta$ in only one
point. It could not meet both of $\alpha$, $\beta$ as would be required of a GCH. In
terms of Fig. 32, any edgeless spacelike hypersurface through P gets trapped
underneath the past null cone of Q and never reaches $\beta$. The nonexistence
of a semispacelike GCH is also easily established. (This particular example
can also be represented conformally as part of the Einstein static universe.
In this case, the nonexistence of a GCH arises, not from a spacelike $\mathscr{I}$, but
from the fact that $\mathscr{I}$ contains a null geodesic in the interior region of M.)
The more general conformally curved plane waves, such as the purely
gravitational plane waves, do not possess GCH's for essentially the same reason as
the above case [77], but the situation becomes somewhat more complicated
owing to the presence of astigmatism in the focusing.

As the final example of this section, we consider the Taub-NUT empty
universe as described by Misner [59, 61, 63, 105] (see the lecture of Misner,
Ch. VI; Geroch, Ch. VIII). Here, the entire space-time $\mathscr{M}$ is to be
topologically $S^3 \times E^1$ (but conformally different from the Einstein universe).
There is a particular spacelike section $\mathscr{S}$, of maximal proper volume, which
has the metric of a 3-sphere (Fig. 33). The four-dimensional geometry at $\mathscr{S}$
does not possess the full rotational symmetry of $S^3$, however. The Weyl
tensor is nonzero, the gravitational principal null directions coinciding in pairs
{22} [cf. (7.6)]. The orthogonal projection into $\mathscr{S}$ of these null directions
gives us a nowhere vanishing line-element field on $\mathscr{S}$. (The two null
directions project as diametrically opposite, at each point of $\mathscr{S}$.) These line
elements are tangential to a congruence of great circles on $S^3$ which form a
set of Clifford parallels. Thus, these circles form a Hopf fibering of $S^3$.
The rotational symmetry that remains is a transitive four-parameter group.

Let us move our section parallelly upward, in Fig. 33, preserving this
symmetry. We get similar spacelike sections but where the volume of the $S^3$
decreases. This volume finally decreases to zero [in accordance with the
Raychaudhuri effect, cf. (7.40)-(7.47)], but instead of collapsing to a point
we simply get $H_+(\mathscr{S})$, which is, here, a compact null hypersurface, generated
by null geodesics which constitute a set of Clifford parallels on $H_+(\mathscr{S}) \cong S^3$.
Beyond $H_+(\mathscr{S})$, the section becomes timelike²⁸ and therefore possesses
closed timelike curves, in a gross violation of reasonable causality.


28 These sections are compact Lorentzian 3-manifolds of a type considered by Avez
[3]. Any closed timelike curve $\gamma$ of this 3-manifold possesses a curious property, namely,
that whereas (clearly) it cannot be continuously deformed to a point while remaining
timelike, there is (for each section) some integer n with the property that $n\gamma$ (that is, $\gamma$ described
with multiplicity n; for this, $\gamma$ needs to be a properly parametrized curve) can be
deformed remaining timelike into $(n+1)\gamma$. Hence, it can be so deformed into a curve of
arbitrarily great length. This provides a counter-example to a result of Avez [4]. I am
indebted to W. Kundt and H. J. Seifert for pointing this out to me.




FIGURE 33. The Taub-NUT model. 


The future-directed null and timelike geodesics emanating from points
of $\mathscr{S}$ fall into two classes. There are those which enter the part of $\mathscr{M}$
beyond $H_+(\mathscr{S})$ (and so cross $H_+(\mathscr{S})$ at a definite point) and those which
simply spiral round and round and approach $H_+(\mathscr{S})$ asymptotically. In
fact, these latter curves have finite affine length into the future, so the model
is not geodesically complete. Misner has shown that, curiously enough, an
alternative extension can be fitted at $H_+(\mathscr{S})$ and beyond, other than the one




shown in Fig. 33. The curves which previously crossed $H_+(\mathscr{S})$ now spiral
round and round, while most of the geodesics which previously spiraled now
cross the new $H_+(\mathscr{S})$ into the new extension. Both extensions are analytic,
but can be added simultaneously only at the expense of giving up the
requirement that $\mathscr{M}$ be a Hausdorff manifold [59, 61, Geroch lecture this volume,
Ch. VIII]. The behavior at $H_-(\mathscr{S})$ is exactly similar to that at $H_+(\mathscr{S})$
(see Fig. 33). This model appears to be highly unstable, in the neighborhood
of $H_+(\mathscr{S})$, against small perturbations in the initial conditions at $\mathscr{S}$.
However, it is instructive in that it exhibits a kind of pathology that has to be
taken into consideration when more general situations are studied.


10 GRAVITATIONAL COLLAPSE 


A question of great interest in relativity theory and cosmology is that 
of the existence of singularities. The usual "big bang" models of relativistic
cosmology are characterized by the presence of an initial state of
infinite curvature and infinite density (cf. Fig. 30). At such an initial singu- 
larity the ordinary view of space-time as a smooth manifold would have to 
break down. Also, the existence of a particle horizon for each world line 
presents us with an essential problem. Different highly curved portions of 
the universe, which previously could not communicate with one another, 
must come together in such a way that they fit consistently. The only 
previous "causal" link between these portions was the initial singularity
itself. We are presented with a problem of principle in addition to all the 
formidable, but essentially technical problems, concerned with the physics of 
early high-density states of these models. This is the question of initial 
conditions in an essentially singular situation. (See Prof. Misner’s lecture, 
this volume, Ch. VI; also [60].) 

It is a reasonable question to ask, however, whether the initial singularity
of these models is not more a consequence of the mathematical idealizations
involved, than of the actual physical situation that the model is
supposed to describe. In particular, it is virtually essential to postulate a
high degree of symmetry (for example, spatial homogeneity), for an exact
model to be amenable to detailed mathematical treatment. And with such
symmetry present, the picture is almost inescapable that all the matter in the
universe should, at some time, have been compressed simultaneously into a
single point, or at least into a line or into a two-dimensional surface
[6, 43, 47a, 99]. Possible ways out of the dilemma that have been suggested
include the adoption of a large enough cosmological constant, the presence of
"unreasonable" matter (for example, negative energy densities; models with
continual creation of matter throughout space would have to come under
this heading), perhaps sufficient rotation, or a change in the laws of physics




which could have been operative at very early stages of the universe. In 
addition, there are possible quantum effects of gravitation to be considered, 
although these might only be important at "ridiculously" high densities
(for example, 10?° g/cc, or, at least, 10°° g/cc). But more significant in the 
first instance than any of these other considerations, is the role played by local 
irregularities of matter and curvature. In fact, near the singularity of the 
model, we really have no reason at all to expect that it should be realistic. 
The situation is perhaps clearer for the time-reversed process. If we imagine 
all the matter in the universe hurtling simultaneously to a single "central
point" (or "line," etc.) then an infinite density at that point (or line, etc.) is
hardly surprising. If, however, we perturb slightly the motion of the in- 
coming matter, so that it is not exactly focused to one point (line, etc.) then 
we might expect that although the density could become very high, it would 
not become infinite. The possibility then may be considered of an effective 
"bounce" to the universe. Thus the presently observed expanding state of
the universe might be regarded as the result of a previously contracting phase, 
the transition from contraction to expansion being achieved via a condensed 
but highly asymmetrical and complicated intermediate phase. 

A picture of this kind has been suggested by many authors [54]. Also 
some detailed considerations by Lifshitz and Khalatnikov [53] had suggested 
that such avoidance of singularities by small perturbations of the motions 
might indeed be possible. However, recent work has ruled out most 
possibilities of this kind, as we shall see in Section 11. In the present section, 
I shall discuss the closely related phenomenon of the singularities encountered 
in gravitational collapse. The essential similarity between gravitational 
collapse and the time-reverse of the early stages of a "big bang" cosmology
has been emphasized particularly by Wheeler [113]. Indeed, there are, 
perhaps, certain advantages in concentrating attention on the more local 
phenomenon of collapse, rather than on the global cosmological question. 
For we do not need to know the entire large-scale structure of the universe 
to study a local phenomenon (or do we?); if the local laws of physics are 
strongly affected by the structure of the universe as a whole ("Mach's
principle"), then we might at least expect the local phenomenon to be the
less sensitive to such changes in these laws; finally, unlike the universe as a
whole, collapse is not something given to us just once, so we are freed to
exploit the immense theoretical potentialities of the "gedanken experiment."
My own personal feeling is that the deeper aspects of the cosmological 
problem will not be solved without a simultaneously deeper appreciation of 
what is involved in collapse. 

But what is meant, precisely, by "gravitational collapse"? The idea
comes originally from the classical study of large spherically symmetrical
bodies. According to the work of Chandrasekhar [21, 39], a cold spherically
symmetrical (nonrotating) body, which has reached the endpoint of



thermonuclear activity, cannot hold itself apart against gravitational forces if 
its mass is much greater than that of the sun. Such a body would therefore 
collapse catastrophically and its collapse could only be halted if by some 
mechanism it were able to throw off sufficient mass to fall below 
Chandrasekhar’s limit. In fact, certain detailed calculations (for references 
see [107]) have indicated that for initial masses between about 1.2 and 15 
times that of the sun, the extreme intensity of the neutrino radiation from the 
collapsing core may be sufficient to blow off the outer regions of the star 
and leave a neutron star at the center. A cold neutron star (which is essentially
an enormous atomic nucleus, bound together gravitationally rather than
with nuclear forces) must actually be somewhat less massive than the sun
(Oppenheimer-Volkoff limit [67]), though vastly smaller, with diameter of
only about 10 kilometers. The gigantic explosion involved in such a process
is thought to account for (at least) one type of supernova [107]. 

If no more matter falls into it, a neutron star could exist indefinitely. 
But what of larger initial masses? Stars with masses of the order of sixty 
times that of the sun are observed, for example. According to the above- 
mentioned calculations, the indications are that the core remains too massive 
to form a neutron star and therefore continues to collapse. Assuming that 
the exact spherical symmetry is maintained, the core would fall radially 
inward through its Schwarzschild radius r = 2m, to encounter a space-time 
singularity at the center r=0. The remaining parts of the star which are 
not blown off would then follow the core in. An observer situated at large 
distances from the star would never see the collapse to within r = 2m, how- 
ever. To him, the star's collapse would appear to slow down and approach
r = 2m asymptotically.

Before considering how physically realistic this picture is, let us examine
the solution of Einstein's equations on which it is based, namely, the
Schwarzschild solution. This defines the spherically symmetrical gravitational
field outside the collapsing star. We have, with the usual Schwarzschild
coordinates,


$ds^2 = \left(1 - \dfrac{2m}{r}\right) dt^2 - \left(1 - \dfrac{2m}{r}\right)^{-1} dr^2 - r^2 (d\theta^2 + \sin^2\theta\, d\varphi^2)$. (10.1)


Let the star's boundary be given by r = f(t). The metric (10.1) then only
applies outside this boundary, that is, r > f(t) (with the usual inequalities
and identifications holding for $\theta$ and $\varphi$). The star's boundary must be
timelike: $|f'(t)| < 1 - 2m/r$, from which it follows that this boundary never
crosses r = 2m in (10.1) (Fig. 34). One might be inclined to infer that the
star therefore always remained outside r = 2m. But if we calculate the total
proper time of a particle on the surface of the star, where we assume, for
simplicity, that the star falls freely inward, we find that this proper time is
finite. Thus, an observer who follows the star inward must, after this finite




FIGURE 34. The Schwarzschild picture. [Radial markings recoverable from the figure: r = 0, r = 2m, r = 4m.]


time has expired, either of necessity be destroyed (for example, because he
encounters infinite tidal forces or some other form of singularity), or else
find himself in a portion of the universe not covered by the coordinates
of (10.1).
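
The finiteness of the proper time can be illustrated for the simplest free-fall case. The sketch below assumes a surface particle released from rest at radius r0 > 2m, for which the radial geodesic equation of (10.1) gives $(dr/d\tau)^2 = 2m/r - 2m/r_0$; the substitution $r = r_0 - s^2$, the midpoint rule, and the function names are choices made for this example. The cycloid formula used for comparison is the standard closed form for this fall.

```python
import math

def proper_time_numeric(r0, r1, m=1.0, n=20000):
    """Proper time to fall from rest at r0 down to r1, by the midpoint rule.

    With r = r0 - s^2 the integrand dr / sqrt(2m/r - 2m/r0) becomes
    2 sqrt(r r0 / (2m)) ds, which is regular at the starting point.
    """
    s_max = math.sqrt(r0 - r1)
    h = s_max / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        r = r0 - s * s
        total += 2.0 * math.sqrt(r * r0 / (2.0 * m))
    return total * h

def proper_time_cycloid(r0, r1, m=1.0):
    """Closed form: r = (r0/2)(1 + cos eta), tau = sqrt(r0^3/(8m)) (eta + sin eta)."""
    eta = math.acos(2.0 * r1 / r0 - 1.0)
    return math.sqrt(r0**3 / (8.0 * m)) * (eta + math.sin(eta))

if __name__ == "__main__":
    m, r0 = 1.0, 10.0
    print(proper_time_numeric(r0, 2.0 * m, m))   # finite proper time to reach r = 2m
    print(proper_time_cycloid(r0, 2.0 * m, m))   # agrees with the line above
```

The two values agree, and both are finite, even though the Schwarzschild coordinate t of the same particle grows without bound as r approaches 2m.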

In fact, in this instance it is the latter possibility which holds. We can
see this if we transform the coordinates in (10.1) by introducing an advanced
time parameter $v = t + r + 2m \log(r - 2m)$. Then the metric takes the
Eddington-Finkelstein [27, 32] form [compare (8.14)]:


$ds^2 = \left(1 - \dfrac{2m}{r}\right) dv^2 - 2\, dv\, dr - r^2 (d\theta^2 + \sin^2\theta\, d\varphi^2)$ (10.2)


When r > 2m, the forms (10.1) and (10.2) are precisely equivalent, but (10.2)
has the advantage that the metric covers a larger region in a nonsingular
way. The metric in the neighborhood of the (null) hypersurface r = 2m is
perfectly regular. It is only because the more familiar form (10.1) possesses
a coordinate singularity at r = 2m, that this hypersurface is sometimes
referred to as the "Schwarzschild singularity." The surface of the star
crosses r = 2m at a definite value of v (although, of course, the here
unsuitable t parameter becomes infinite) and proceeds inward toward r = 0.
But because of the positioning of the light cones (see Fig. 35), an outside
observer cannot see any of the empty region r < 2m. More particularly,




FIGURE 35. The Eddington-Finkelstein picture. [Labels recoverable from the figure: "region not covered"; radial markings r = 0, r = 2m, r = 4m.]


this is because, in the empty region, r = 2m is a null hypersurface which lies
"above" (that is, to the future of) the entire external region. An observer
or signal can cross the hypersurface from the outside to the inside but not the
other way around. The hypersurface r = 2m is an event horizon of a more
absolute character than most of those considered in Section 9. In fact, all
geodesics which escape to infinity have this hypersurface as their event horizon.
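
The one-way character of r = 2m can be read off from the radial null directions of (10.2). Setting $d\theta = d\varphi = 0$ and $ds^2 = 0$ gives either dv = 0 (the ingoing rays) or $dr/dv = \tfrac{1}{2}(1 - 2m/r)$. The sketch below tabulates the second slope; it assumes that v increases toward the future, and the function names are illustrative.

```python
def outgoing_null_slope(r, m=1.0):
    """dr/dv along the radial null direction of (10.2) that is not v = const."""
    return 0.5 * (1.0 - 2.0 * m / r)

def can_move_outward(r, m=1.0):
    """True if some future-directed radial causal direction has dr/dv > 0.

    Future-directed radial causal directions lie between the ingoing ray
    (dv = 0, dr < 0) and the ray with slope outgoing_null_slope, so that
    slope is the largest dr/dv available at radius r.
    """
    return outgoing_null_slope(r, m) > 0.0

if __name__ == "__main__":
    for r in (0.5, 1.0, 1.9, 2.0, 2.1, 4.0, 100.0):
        print(r, outgoing_null_slope(r), can_move_outward(r))
```

The slope is positive outside r = 2m, vanishes on it (so the hypersurface is itself generated by these rays), and is negative inside, where every future-directed causal curve has decreasing r.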

If we trace the history of the star’s surface inside the r = 2m region, we 
see that it must, of necessity, encounter r = 0, for the surface must continue 
to move in a timelike direction and the null-cones are tipping over more and 
more toward the r= 0 axis. Again if we calculate the total proper time of a 
particle on the surface of the star (assuming free fall or otherwise), up until 
the point at which it encounters r = 0, we find that this is finite. But this 
time there is no hope of extending the solution further. If we evaluate the 
curvature scalars constructed from the Weyl tensor, we find that they tend to 
infinity as r approaches zero. Thus, our observer who successfully followed 
the star through r = 2m must now be torn to pieces by the infinite tidal forces 
at the true singularity at r = 0. 

This is the picture presented by Oppenheimer and Snyder [68] when
they considered the dynamics of a collapsing uniform dust cloud. In fact 
the space-time interior to such a dust cloud can also be described simply and 




explicitly and turns out to be nothing other than (a portion of) a Friedmann 
universe (see Section 9). This again emphasizes the close relationship be- 
tween the situation of a final singularity in gravitational collapse and of a 
final (or initial) singularity for a relativistic cosmology. 

If we wish to consider the entire history of a gravitationally bound
Oppenheimer-Snyder gas cloud and not just the final collapse phase, then
even the metric form (10.2) does not cover a sufficient portion of the empty
region. The behavior of such a dust cloud is time-symmetrical: Like the
Friedmann universe with k = 1 ($\lambda = 0$), it starts from an initial singularity,
expands to a maximum volume and then recontracts to a final singularity.
Thus we may expect to join the matter-filled region to an empty, spherically
symmetrical exterior region which is also time-symmetrical. We can, in
fact, exhibit such a region if we use the U, V coordinates of Kruskal [50a],
which are related to the Schwarzschild r, t (in the region r > 2m) by


$\dfrac{V}{U} = -e^{t/2m}, \qquad UV = \left(1 - \dfrac{r}{2m}\right) e^{r/2m}$ (10.3)

The metric then takes the form

$ds^2 = f^2\, dU\, dV - r^2 (d\theta^2 + \sin^2\theta\, d\varphi^2)$ (10.4)

where

$f^2 = \left(\dfrac{32 m^3}{r}\right) e^{-r/2m}$ (10.5)

The matter region is joined to the Kruskal region along a timelike
hypersurface. This hypersurface is described by a timelike geodesic in the
Kruskal (U, V)-plane (Fig. 36). Thus, only the portion to the right of this
geodesic in the diagram is to be considered in this model. The left-hand
portion of Fig. 36 must be replaced by a spherically symmetric portion
($\cong D^3 \times E^1$) of the Friedmann universe (9.18) whose boundary is again a
hypersurface ($\cong S^2 \times E^1$) generated by timelike geodesics. These two
portions do, in fact, fit together adequately smoothly, cf. [107].

One drawback of the Kruskal form of the Schwarzschild metric is that
(10.4) is not given "explicitly," depending on the solution, for r, of (10.3).
The Eddington-Finkelstein form (10.2) is, however, much simpler, and it is
interesting that the main features of the Kruskal extension can be obtained
simply from (10.2). To achieve this, we note that (10.2) covers the two
portions A, B of the Kruskal diagram (Fig. 36) (with $2mV^2 = e^{v/2m}$). By
using a retarded parameter u [cf. (8.11)] in place of the advanced parameter v,
we can likewise cover the portion A, D of Fig. 36 with another "coordinate
patch," the overlap region being A. In an exactly similar way, we can cover
the portion C, B or the portion C, D with a "coordinate patch" having the
form of (10.2). The coordinate transformations in the overlap regions C and


D can be obtained, if desired, by transforming back to the form (10.1) (with
0 < r < 2m) as an intermediate stage. The one feature of the Kruskal
extension which is not obtained by this method is the regular nature of the
space-time at the 2-sphere U = V = 0. However, the method has a
considerable heuristic value in that it applies at once to certain other more
complicated extensions, notably those of the Kerr and Reissner-Nordström
solutions [cf. (10.18), (10.20)] [12, 18, 37]. In this connection it is useful to
have also the version (Fig. 37) of the Kruskal diagram in which conformal
infinity is represented (for example, use coordinates $p = \tan^{-1}\sinh^{-1} V$,
$q = \tan^{-1}\sinh^{-1} U$).


FIGURE 36. The Kruskal picture. [Labels recoverable from the figure: "boundary of collapsing star"; "boundary of collapsing dust cloud".]


FIGURE 37. The Kruskal picture with conformal infinity represented. [Label recoverable from the figure: "singularity".]
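
The remark that (10.4) depends on solving (10.3) for r can be made concrete with a short numerical inversion. This is a sketch under an explicit assumption: (10.3) is taken in the form $V/U = -e^{t/2m}$, $UV = (1 - r/2m)\,e^{r/2m}$, with $U < 0 < V$ in the exterior region. The function names and the bisection scheme are illustrative.

```python
import math

def uv_product(r, m=1.0):
    """Right-hand side of the assumed relation UV = (1 - r/2m) exp(r/2m)."""
    return (1.0 - r / (2.0 * m)) * math.exp(r / (2.0 * m))

def kruskal_from_schwarzschild(t, r, m=1.0):
    """(U, V) for an exterior point r > 2m, under the assumed form of (10.3)."""
    root = math.sqrt(r / (2.0 * m) - 1.0)
    U = -math.exp((r - t) / (4.0 * m)) * root
    V = math.exp((r + t) / (4.0 * m)) * root
    return U, V

def r_from_uv(uv, m=1.0):
    """Solve uv_product(r) = uv for r > 0 by bisection.

    uv_product decreases monotonically from 1 at r = 0, so a unique root
    exists whenever uv < 1 (uv = 1 corresponds to r = 0).
    """
    if uv >= 1.0:
        raise ValueError("UV must be less than 1")
    lo, hi = 0.0, 2.0 * m
    while uv_product(hi, m) > uv:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if uv_product(mid, m) > uv:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    U, V = kruskal_from_schwarzschild(0.3, 5.0)
    print(r_from_uv(U * V))  # recovers r = 5 to rounding accuracy
```

With these assumed relations one also finds $2mV^2 = e^{v/2m}$ for $v = t + r + 2m\log(r - 2m)$, matching the relation quoted in the text between V and the advanced time parameter.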

Let us return, now, to the case of the collapsing star. Can the descrip- 
tion that I have given be regarded as at all realistic? For example, even if 
we retain the exact spherical symmetry, can it be said that we really know 
enough about the properties of matter at the density (somewhat greater than
that of an atomic nucleus, for the cases considered) at which the star is
supposed to fall through its “Schwarzschild throat”? And might it not be
that for some reason, a massive star inevitably ejects sufficient material, as it
nears r = 2m, that its mass invariably falls below the Chandrasekhar or
Oppenheimer-Volkoff limit, thus enabling a stable self-supporting state to
become possible? It seems that for many years, the prevailing belief had 
been that a collapsing star would always be able to save itself by such means. 
This may have been partly because astronomers did not feel the need to 
consider self-supporting masses that were enormously greater than that of the 
sun. But with the discovery of the quasi-stellar objects (see, for example, 
[90]), renewed interest in the subject of gravitational collapse has been 
stimulated. For, considering the fantastic quantities of energy that these 
objects appear to be emitting, and their remarkably small size, it has been 
suggested that “individual” masses of the order, say, $10^6$ or $10^8$ times that of
the sun may be involved. Now the Schwarzschild radius of a spherically
symmetric body is proportional to its mass. Thus the characteristic density
(mass divided by the cube of the radius) at which such a body crosses r = 2m
should be inversely proportional to the square of its mass. For an object of
$10^6$ or $10^8$ solar masses, this density is
not unreasonably high. Indeed, Fowler has emphasized that for an object
of $10^{11}$ solar masses (such as a good-sized galaxy) this characteristic density is
less than that of air! And there seems no reason at all to believe that such an 
object would necessarily eject virtually all its material before reaching r = 2m. 
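The scaling argument above can be made concrete with a rough numerical sketch. Everything here is an illustrative assumption rather than part of the lecture: the density is taken as mass divided by the Euclidean volume of a sphere of radius 2GM/c² (the conventional-units form of r = 2m), the constants are rounded, and the function names are hypothetical.

```python
import math

G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8         # speed of light, m/s (rounded)
M_SUN = 1.989e30    # solar mass, kg (rounded)
AIR_DENSITY = 1.2   # kg/m^3, roughly air at sea level (assumed reference)


def schwarzschild_radius(mass_kg: float) -> float:
    """Radius r = 2GM/c^2, in metres."""
    return 2.0 * G * mass_kg / C**2


def characteristic_density(mass_kg: float) -> float:
    """Mass over the Euclidean volume of a sphere of Schwarzschild radius."""
    r = schwarzschild_radius(mass_kg)
    return mass_kg / (4.0 / 3.0 * math.pi * r**3)


if __name__ == "__main__":
    for solar_masses in (1.0, 1e6, 1e8, 1e11):
        rho = characteristic_density(solar_masses * M_SUN)
        print(f"{solar_masses:8.0e} solar masses: {rho:10.3e} kg/m^3")
```

Because the radius grows linearly with the mass, multiplying the mass by ten divides this density by one hundred; with these rounded constants the value for $10^{11}$ solar masses comes out well below the assumed density of air.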

Then what about the role played by deviations from spherical sym- 
metry? For example, can one even define the analog of r= 2m for an 
asymmetrical body? And might it not be that the presence of rotation 
always prevents the final collapse? The question of rotation is a particularly 
pertinent one. For example, if differential rotation is present, then equili- 
brium states exist for objects of very much larger mass than Chandrasekhar’s 
limit [69]. However, with viscosity present, only uniform rotation could 
represent a possible final state. Stability of such bodies is a crucial con- 
sideration [21a], and it appears that under suitable circumstances a rotating 
body can collapse. Now, fortunately, there is a known solution of Einstein’s 
vacuum equations, namely that of Kerr [12,48], which generalizes the 
Schwarzschild solution by the inclusion of angular momentum. The Kerr 
solution contains two arbitrary parameters m and a, where m defines the 
mass as in Schwarzschild’s solution, and where ma defines the angular 
momentum. The solution [given explicitly in (10.18)] is still of a rather
special character (for example, the quadrupole moment has to be given by
$2ma^2$) but it seems likely that at least the main features of the geometry of
the space-time surrounding a rotating collapsing body can be gleaned from a
study of Kerr’s solution. One of the most significant features of the solution,
in fact, is that if a > m there is no analog of the “Schwarzschild throat,” that
is, there is no event horizon for external observers, which would prevent
signals from the interior regions being received outside. On the other hand, if
a < m, there is such an event horizon and the solution qualitatively resembles
the Schwarzschild solution in that it possesses a similar “throat” which can
finally swallow up the material of the (super-) star. There are other features
of the solution which qualitatively differ, however, from the case of the
Schwarzschild solution. Some of these will be discussed shortly, but in the
meantime, I shall concentrate on the question of the “throat” itself and give
a definition which may be used as a criterion for its existence. 

Let us return to the Schwarzschild solution. For convenience, consider
the metric in the form (10.2) and examine Fig. 35. We wish to single
out a property of the region B which characterizes it as “peculiar” in some
sense. It should be emphasized, however, that there is nothing locally
peculiar about the B-region. A small neighborhood of any point in the
B-region is as good a solution of Einstein’s vacuum equations as any other,
and there is nothing at all “singular” about it. But it seems that in some
way the whole B-region is “shrinking” in an apparently inevitable way to
the space-time singularity at r = 0. To express this “shrinkage” we need
to single out a partly global aspect of the B-region, but it must not be too
global because we require a property that will not be destroyed by small
perturbations in the metric at the “time” of collapse. Consider, then, a
point T in the B-region of Fig. 35 which can lie just inside r = 2m (but
outside the matter region). Such a point describes a spacelike 2-sphere $\mathcal{T}$
in the whole space-time, of total surface area $4\pi r^2$ (r < 2m). Now any
system of material particles whose world-lines intersect $\mathcal{T}$ must have a
subsequent motion which is bounded by the velocity of light. That is to say,
if we consider the extreme situation of a flash of (idealized) light emitted at $\mathcal{T}$,
there will be an “outgoing” flash and an “ingoing” flash, described by the
two null geodesic segments in Fig. 35 with past endpoint T. These two null
geodesic segments describe null hypersurfaces through $\mathcal{T}$, which together
constitute $\dot{I}_+[\mathcal{T}]$. Now the significant feature is that both parts of the
boundary are “shrinking.” Not only is the total surface area of the sphere
described by a point moving up the “ingoing” null geodesic decreasing, but
so also is the total surface area correspondingly decreasing for the “outgoing”
flash. This, then, is the property of $\mathcal{T}$ which I shall use to characterize
it as a trapped surface [75]. The presence of a trapped surface in a
region will then be our indication that there may be something “peculiar”
about the subsequent history of the region.
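The “shrinking” of both flashes can be illustrated with a short sketch. It assumes that (10.2) is the advanced Eddington-Finkelstein form ds² = (1 - 2m/r) dv² - 2 dv dr - r²(dθ² + sin²θ dφ²), which is what (10.20) reduces to when e = 0; the function names are hypothetical. Radial null rays are then either v = const (ingoing, with r decreasing toward the future) or satisfy dr/dv = (1 - 2m/r)/2 (outgoing).

```python
import math


def outgoing_dr_dv(r: float, m: float) -> float:
    """Slope dr/dv of the outgoing radial null ray (assumed form of (10.2))."""
    return 0.5 * (1.0 - 2.0 * m / r)


def area_rates(r: float, m: float) -> tuple[float, float]:
    """Rates of change of the sphere area 4*pi*r^2 along the two flashes.

    ingoing:  with respect to lam = -r along v = const (future-directed)
    outgoing: with respect to v along the outgoing null ray
    """
    ingoing = -8.0 * math.pi * r
    outgoing = 8.0 * math.pi * r * outgoing_dr_dv(r, m)
    return ingoing, outgoing


def is_trapped_sphere(r: float, m: float) -> bool:
    """True when the area decreases along both future-directed flashes."""
    ingoing, outgoing = area_rates(r, m)
    return ingoing < 0.0 and outgoing < 0.0


if __name__ == "__main__":
    m = 1.0
    for r in (0.5, 1.9, 2.1, 10.0):
        print(f"r = {r:5.2f}: trapped = {is_trapped_sphere(r, m)}")
```

In this sketch the ingoing area always decreases, while the outgoing area decreases only for r < 2m, which is the distinguishing property of the B-region described above.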


To be more precise:


A TRAPPED SURFACE $\mathcal{T}$ IS A SMOOTH, COMPACT, SPACELIKE
2-SURFACE²⁹ WITH THE PROPERTY THAT THE NULL GEODESICS
WHICH MEET $\mathcal{T}$ ORTHOGONALLY LOCALLY CONVERGE IN FUTURE
DIRECTIONS. (10.6)


The quantity $\rho$ of Section 7 conveniently measures this convergence. The
null geodesics meeting a spacelike 2-surface orthogonally always generate a
null hypersurface, so by (7.28) we have $\rho = \bar{\rho}$. Thus, the characteristic
feature of a trapped surface is that


$\rho > 0$ at $\mathcal{T}$ (10.7)


for both the null hypersurfaces through $\mathcal{T}$. As we move into the future up
either of these null hypersurfaces, the surface area decreases, initially, at
every point of $\mathcal{T}$.

The idea, here, is to establish a connection between the existence of a 
trapped surface in a space-time and the subsequent development either of a 
singularity in space-time [such as r = 0 in (10.2)] which may involve infinite 
curvature, or else of some other space-time behavior which would be very 
strange from the physical point of view. But before considering the rigorous 
mathematical result, let us ask the question whether it is reasonable to expect 
trapped surfaces to develop at all in our actual universe. For any individual 
case of a collapsing star or superstar, this is clearly a matter of the detailed 
astrophysics involved. The amount of rotation present, the mass loss via 
neutrinos, through gravitational radiation, the exact nature of the asymmetry 
present, the nature of the equations of state, magnetic fields, etc.; all these 
may be relevant. But the essential point I want to make here is that there 
can be no reason of principle against a trapped surface developing. To 
illustrate this, I shall consider the following “gedanken experiment.”

A technically advanced (but presumably exceedingly foolhardy) race
of beings inhabit a galaxy (preferably elliptical) which contains about $10^{11}$
stars. By means of rockets, these beings contrive to alter the velocities of 
the stars, so that there remains practically no transverse component to the 
stars’ motions. Furthermore, their radial motions are adjusted so that all 
the stars will fall toward the center so as to reach its vicinity at almost the 
same time. (The mass-energy expended in this operation would be small 
compared to the mass of the stars.) The region they have to aim for has 
roughly a radius 50 times that of the solar system. There is plenty of room 
in a volume of this size for all the stars, and they can arrive there before any 
serious problem of stellar collisions arises. (They can even steer to avoid
collisions, if necessary!) But does this result in the development of a trapped
surface? The fact that it does depends on the most primitive considerations
of general relativity. We consider a flash of (idealized) light emitted at the
center at about the time the stars enter the critical region. Because of the
focusing effect on idealized light rays [7, 87] (that is, on null geodesics) by
the gravitational field of each star (which is actually an observed effect of
general relativity), a sufficient density of light rays which started off by
diverging from the center will pass through (and near to) enough stars so
that they start converging again. Under these circumstances it can be shown
that a topological sphere which lies just inside the outermost boundary of
the light rays will indeed be a trapped surface.


29 It is convenient also to assume that $\mathcal{T}$ is a semispacelike set, but this is not
strictly necessary for the proof of Theorem I.

This example serves to illustrate, also, the fact that there is no reason
why an observer should be “destroyed” as he enters the critical region. The
curvature is still extremely small in the neighborhood of the trapped surface
and the space-time there is perfectly regular.³⁰ Indeed, locally there is
nothing peculiar about the trapped surface itself. Even in Minkowski
space-time, many surfaces exist which are “locally trapped.” That is to say,
they satisfy the conditions of (10.6) except for the fact that they are not 
compact. To construct such a surface, we need only consider the inter- 
section of the past null cones of two spatially separated points. 

If, then, it is accepted that trapped surfaces can exist, what of the 
consequences? I shall show, in fact, that if we assume (provisionally) that 
the universe is open (that is, not spatially compact), then our present know- 
ledge of the laws of physics is insufficient to enable us to calculate even in 
principle the future history of such a collapsing system. 

To predict the future of a system, we would normally ask that adequate
Cauchy data be specified on some spacelike hypersurface. We may take it
that the relevant local laws of physics are such that effects can be propagated
along timelike or null curves, but not along spacelike ones. Thus, if data
are specified on some semispacelike closed set $\mathcal{S}$, then this can determine the
situation only in $D_+(\mathcal{S})$ and $D_-(\mathcal{S})$. For regions exterior to these, we
could have relevant information carried along nonspacelike curves which
fails to “register” on $\mathcal{S}$. Now a collapse situation can be set up starting
from perfectly reasonable data on a spacelike hypersurface $\mathcal{C}$, which (since
we are trying to consider a local physical situation, rather than a cosmological
one) we attempt to think of as effectively extending to infinity whether or not
the universe is actually spatially noncompact. Thus, for predictability, we
ask that a noncompact spacelike hypersurface $\mathcal{C}$ should exist which is a GCH
for $\mathcal{M}$, or, more appropriately, for which $D_+(\mathcal{C}) = I_+[\mathcal{C}]$. The question of
whether the cosmological nature of space-time is actually relevant to a local
collapse situation is not a trivial one, as we shall see later.


30 In [26], slightly aspherical collapse is examined and it is verified that trapped
surfaces evolve.


Theorem I. The following requirements on a space-time³¹ $\mathcal{M}$ are mutually
inconsistent:


(Ia) There exists a noncompact spacelike hypersurface $\mathcal{C}$ for which
$D_+(\mathcal{C}) = I_+[\mathcal{C}]$

(Ib) There exists a trapped surface $\mathcal{T} \subset I_+[\mathcal{C}]$

(Ic) $R_{ab} l^a l^b \le 0$ for each null vector $l^a$

(Id) $\mathcal{M}$ is future-null complete


In (Id), “future-null complete” means that every null geodesic in $\mathcal{M}$ can
be extended into the future to arbitrarily large values of a (given) affine
parameter. We may think of this physically as the statement that “photons
(or neutrinos or gravitons) cannot just disappear.” This seems to be a very
reasonable physical requirement on a space-time. In Section 11, we have to
consider another type of completeness, namely, timelike geodesic completeness
(“particles in inertial motion cannot just disappear”). Yet other types
are possible, such as “bounded acceleration completeness” which states
that all inextendible timelike curves of bounded curvature have infinite
total length³² (that is, “particles experiencing limited forces cannot just
disappear”). These types of completeness are all inequivalent for suitably
contrived examples [35, 52]. However, it probably does not make much
physical difference which definition is used. It is possible that the definitions
are only inequivalent in the presence of regions of arbitrarily large curvature.
To establish the existence of such regions would really be nearer to the object
of results such as Theorems I, II, III, in any case. Unfortunately no result
which directly predicts the existence of such high-curvature regions, on the
basis of “reasonable” and “generic” physical assumptions, has yet been
established, although we may think of Theorem I, and particularly Theorems
II and III, as strong indirect indications of this; cf. Geroch lecture Ch. VIII.

The condition $R_{ab} l^a l^b \le 0$ is simply (7.35), and is a consequence of
Einstein’s equations (7.1) (with or without the $\Lambda$-term) and the positive
definiteness of the energy density (7.34). (In an “eigentetrad” of $T_{ab}$, this
positive definiteness means that $T_{00} \ge 0$, $T_{00} + T_{11} \ge 0$, $T_{00} + T_{22} \ge 0$,
$T_{00} + T_{33} \ge 0$.) This is very desirable physically, especially since a violation
of this inequality seems to lead to a serious problem in connection with
quantum field theory, namely, an apparent catastrophic instability of the
vacuum. But the issue is not completely clear-cut.
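The eigentetrad form of the energy condition lends itself to a direct check. The sketch below is purely illustrative: the function name is hypothetical, and the inputs are taken to be the diagonal components $T_{00}$ (energy density) and $T_{11}$, $T_{22}$, $T_{33}$ (principal pressures) in the eigentetrad.

```python
def satisfies_energy_condition(t00: float, t11: float,
                               t22: float, t33: float) -> bool:
    """Check T00 >= 0 and T00 + Tii >= 0 (i = 1, 2, 3) in an eigentetrad."""
    return t00 >= 0.0 and all(t00 + tii >= 0.0 for tii in (t11, t22, t33))


if __name__ == "__main__":
    examples = {
        "dust": (1.0, 0.0, 0.0, 0.0),
        "radiation": (1.0, 1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0),
        "large tension along one axis": (1.0, -2.0, 0.0, 0.0),
    }
    for name, components in examples.items():
        print(f"{name}: {satisfies_energy_condition(*components)}")
```

Dust and radiation satisfy the inequalities; a tension exceeding the energy density along one principal axis does not.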


31 Space-times considered here are always assumed to be time-orientable (cf. Section
5). However, all the results here (for example, Theorems I, II, III) will also apply in a
suitable form without this assumption. For any non-time-orientable Lorentzian manifold,
there always exists a double covering manifold which is time-orientable. (Consider the
space of all null half-cones on the manifold. This gives the desired double covering,
cf. [55].)

32 The term “length” will often be used here to denote “proper time” for a timelike
curve.


Before we proceed to the proof of Theorem I, it will be useful to establish
some lemmas concerning the structure of an SSB. We recall [cf. (9.4)-
(9.6)] that any SSB is a subset $\mathcal{S}$ of $\mathcal{M}$ of the form

$\mathcal{S} = \dot{I}_+[\mathcal{X}]$ or, equivalently, $\mathcal{S} = \dot{I}_-[\mathcal{Y}]$ (10.8)

where $\mathcal{X}, \mathcal{Y} \subset \mathcal{M}$. Let us divide $\mathcal{S}$ into four disjoint parts (some of which
may be vacuous), $\mathcal{S}_N$, $\mathcal{S}_+$, $\mathcal{S}_-$, and $\mathcal{S}_0$, defined as follows. If $X \in \mathcal{S}$,
then there may or may not exist points $Y, Z \in \mathcal{S}$, distinct from X, for which

$I_+(X) \subset I_+(Y)$, $I_-(X) \subset I_-(Z)$ (10.9)

the different possibilities defining the subsets $\mathcal{S}_N$, $\mathcal{S}_+$, $\mathcal{S}_-$, $\mathcal{S}_0$ according
to the scheme

              (∃Z)               (∄Z)
    (∃Y)      $\mathcal{S}_N$    $\mathcal{S}_+$
    (∄Y)      $\mathcal{S}_-$    $\mathcal{S}_0$          (10.10)

The intuitive meaning of these subsets is that $\mathcal{S}_N$ represents the portion of
$\mathcal{S}$ which is null, where $\mathcal{S}_+$ and $\mathcal{S}_-$ consist, respectively, of the future and past
end-points of $\mathcal{S}_N$, and that $\mathcal{S}_0$ represents the spacelike portion of $\mathcal{S}$
(see Fig. 38). A more precise statement is contained in the following lemma,
which also gives a somewhat more useful condition for a point to lie in
$\mathcal{S}_N$, $\mathcal{S}_+$, or $\mathcal{S}_-$.


FIGURE 38. The different portions of a semispacelike boundary. (Null
directions are inclined at 45°.)


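Lemma I. Let $X \in \mathcal{S}$, where $\mathcal{S}$ is an SSB given as in (10.8). Let $\mathcal{Q}$ be an
open set containing X. Then


(a) $I_+(X) \subset I_+[\mathcal{X} - \mathcal{Q}]$ implies $X \in \mathcal{S}_N \cup \mathcal{S}_+$
(b) $I_-(X) \subset I_-[\mathcal{Y} - \mathcal{Q}]$ implies $X \in \mathcal{S}_N \cup \mathcal{S}_-$
Furthermore, a null geodesic segment on $\mathcal{S}$


(c) passes through X if $X \in \mathcal{S}_N$
(d) has X as future end-point if $X \in \mathcal{S}_+$
(e) has X as past end-point if $X \in \mathcal{S}_-$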


To prove Lemma I, assume, first, that $I_+(X) \subset I_+[\mathcal{X} - \mathcal{Q}]$. Let $\mathcal{B}$ be a small
normal coordinate ball with center X and boundary $\dot{\mathcal{B}}$, where $\mathcal{B} \subset \mathcal{Q}$.
Consider a sequence $(X_\alpha)$ of points, with $X_\alpha \in I_+(X) \cap \mathcal{B}$, which converges to
X. For each $X_\alpha$, there exists $W_\alpha \in \mathcal{X} - \mathcal{Q}$ with $W_\alpha < X_\alpha$. The timelike
curve from $W_\alpha$ to $X_\alpha$ contains a connected portion in $\mathcal{B}$, with past end-point
$Y_\alpha \in \dot{\mathcal{B}}$ and future end-point $X_\alpha$. Now $\dot{\mathcal{B}}$ is compact, so there is a point Y on
$\dot{\mathcal{B}}$ such that a subsequence of $(Y_\alpha)$ converges to Y. The join of Y, X in the
normal coordinate system is a geodesic k, through X. It is readily seen that k
cannot be spacelike (for small $\mathcal{B}$). Also k cannot be timelike, since then we
should have Y < X, so $W_\alpha < Y_\alpha < X$ for some $\alpha$, violating $X \notin I_+[\mathcal{X}]$.
Thus, k is a null geodesic. Hence $I_+(X) \subset I_+(Y)$. Now, $Y_\alpha \in I_+[\mathcal{X}]$, so
$Y \in \overline{I_+[\mathcal{X}]}$. Also $Y \notin I_+[\mathcal{X}]$, since $X \notin I_+[\mathcal{X}]$, so that $Y \in \mathcal{S}$. This
establishes (a), (d) and half of (c). The proof of (b), (e) and the other half
of (c) is exactly similar. (That the two geodesics thus obtained for (c) must
coincide, follows since $\mathcal{S}$ is semispacelike.)

When two null geodesic segments on $\mathcal{S}$ have a point X in common, this
can only happen at $X \in \mathcal{S}_+$ or $X \in \mathcal{S}_-$. This result also holds in the case
of “infinitesimally neighboring” null geodesic segments on $\mathcal{S}$. The following
result makes this more precise.


Lemma II. Let $\mathcal{S}$ be an SSB and k a null geodesic segment lying on $\mathcal{S}$, with a
future [resp. past] end-point P. Suppose some open set $\mathcal{Q}$, containing k − P,
meets $\mathcal{S}$ in a smooth null hypersurface $\mathcal{N}$ for which the convergence $\rho$ is
unbounded near P (choosing the tangent vector $l^a$ to k smoothly at P).
Then $P \in \mathcal{S}_+$ [resp. $\mathcal{S}_-$].


The proof of Lemma II that I give depends on another result (actually more
detailed than is necessary):


Lemma III. Let k be a null geodesic segment in $\mathcal{M}$ and let $\mathcal{N}_1$, $\mathcal{N}_2$ be
smooth null hypersurfaces through k for which $\mathcal{N}_1 \subset I_+[\mathcal{N}_2] \cup \mathcal{N}_2$. Then
$\rho_2 - \rho_1 \ge |\sigma_2 - \sigma_1|$ on k ($\rho_i$, $\sigma_i$ refer to $\mathcal{N}_i$, with the same $l^a$ for each $\mathcal{N}_i$
at k; i = 1, 2).


To prove Lemma III, consider two smooth scalar functions $u_i$ (i = 1, 2) on $\mathcal{M}$,
where $u_i = 0$ defines $\mathcal{N}_i$ (in some neighborhood of k), with $u_i$ increasing
toward the future with nonvanishing gradient on $\mathcal{N}_i$. We can scale the
$u_i$’s so that $\nabla_a u_1 = \nabla_a u_2 = l_a$ on k. In a small enough neighborhood of k we
have $u_2 \ge u_1$. It follows that $X^a \nabla_a (u_2 - u_1) + \tfrac{1}{2} X^a X^b \nabla_a \nabla_b (u_2 - u_1) \ge 0$,
on k, for each set of components $(X^a)$ with $|X^a|$ small enough. Let $\nabla_a u_i =
l_{(i)a}$, so $l_{(1)a} = l_{(2)a} = l_a$ on k. Choose $m^a$, on k, complex, null and orthogonal
to $l_a$, with $m_a \bar{m}^a = -1$. Putting $X^a = \lambda m^a + \bar{\lambda} \bar{m}^a$, we get $\lambda^2(\sigma_2 - \sigma_1) +
2\lambda\bar{\lambda}(\rho_2 - \rho_1) + \bar{\lambda}^2(\bar{\sigma}_2 - \bar{\sigma}_1) \ge 0$ [cf. (7.33)]. This must hold independently
of $\lambda$, and $\rho_2 - \rho_1 \ge |\sigma_2 - \sigma_1|$ follows.
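The last step of the proof of Lemma III rests on a purely algebraic fact: a real quadratic form of the type λ²Δσ + 2λλ̄Δρ + λ̄²Δσ̄ is non-negative for every complex λ exactly when Δρ ≥ |Δσ|. The sketch below checks this numerically by scanning the phase of λ on the unit circle. The names `d_rho` and `d_sigma` are hypothetical stand-ins for the differences in convergence and shear, and the sign and index conventions of (7.33) are assumed rather than taken from the text.

```python
import cmath
import math


def quadratic_form(lam: complex, d_rho: float, d_sigma: complex) -> float:
    """Evaluate lam^2 * d_sigma + 2 |lam|^2 d_rho + conj(lam)^2 * conj(d_sigma)."""
    lam_bar = lam.conjugate()
    value = (lam**2 * d_sigma
             + 2.0 * lam * lam_bar * d_rho
             + lam_bar**2 * d_sigma.conjugate())
    return value.real  # the imaginary parts cancel identically


def minimum_over_phases(d_rho: float, d_sigma: complex,
                        samples: int = 720) -> float:
    """Approximate the minimum of the form over unit-modulus lam."""
    return min(
        quadratic_form(cmath.exp(2j * math.pi * k / samples), d_rho, d_sigma)
        for k in range(samples)
    )


if __name__ == "__main__":
    # On the unit circle the exact minimum is 2 * (d_rho - |d_sigma|).
    for d_rho, d_sigma in ((2.0, 1.0j), (1.0, 0.6 + 0.8j), (0.5, 1.0 + 0.0j)):
        print(d_rho, d_sigma, round(minimum_over_phases(d_rho, d_sigma), 4))
```

Since the form scales with |λ|², checking unit-modulus values suffices; the minimum is negative precisely when the convergence difference is smaller than the modulus of the shear difference.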


To prove Lemma II, suppose $P \in \mathcal{S}_N$ (with P a future end-point of k).
Then, k can be extended into the future on $\mathcal{S}$ to a point $R \in \mathcal{S}$. The past
null-cone $\dot{I}_-(R)$ is nonsingular in some open set $\mathcal{G}$ with $k - P \subset \mathcal{G} \subset \mathcal{Q}$
(for R near enough to P). Lemma III now applies with $\mathcal{N}_2 = \mathcal{G} \cap \dot{I}_-(R)$
and $\mathcal{N}_1 = \mathcal{G} \cap \mathcal{S}$. But if $\rho = \rho_1$ is unbounded near P (and positive, since
$\Phi$ is not unbounded at P), then $\rho_2 - \rho_1$ must become negative near P, violating
the inequality in Lemma III. Thus Lemma II is established.

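One immediate consequence of Lemma I is that if $\mathcal{T}$ is any closed
subset of $\mathcal{M}$ and


$\mathcal{S} = \dot{I}_+[\mathcal{T}]$ (10.11)
then,
$\mathcal{S} - \mathcal{T} \subset \mathcal{S}_N \cup \mathcal{S}_+$ (10.12)


This follows from part (a) since if $X \in \mathcal{S} - \mathcal{T}$, some neighborhood $\mathcal{Q}$ of X
will not meet $\mathcal{T}$, so $\mathcal{X} - \mathcal{Q} = \mathcal{X}$, with $\mathcal{X} = \mathcal{T}$. From (10.12) and parts
(c) and (d), it follows that each point of $\mathcal{S} - \mathcal{T}$ is the future end-point of a
null geodesic segment on $\mathcal{S}$. As particular cases, this result applies to
$\dot{I}_+(P)$ or to any particle horizon. We may also take the time-reverse of the
result, so if $\mathcal{S} = \dot{I}_-[\mathcal{T}]$, then each point of $\mathcal{S} - \mathcal{T}$ is the past end-point of a
null geodesic segment on $\mathcal{S}$. This applies, in particular, to $\dot{I}_-(P)$ or to any
event horizon.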

Another consequence of Lemma I concerns the structure of Cauchy
horizons. The result is not needed for Theorem I, but will be required in the
next section. There are two simple ways of obtaining any Cauchy horizon
as a part of an SSB. Let $\mathcal{H}$ be closed and semispacelike. Then the set
int $\{D_-(\mathcal{H}) \cup I_+[\mathcal{H}]\}$ is the union of the $I_+(P)$ with $P \in D_-(\mathcal{H})$. The
boundary of this set is an SSB which contains the Cauchy horizon $H_-(\mathcal{H})$
as part (that is, as the intersection with $D_-(\mathcal{H})$). This is the first way. The
second way can be described as follows. Let $\mathcal{H}$ be closed and semispacelike,
as before, and define [42]


$W_-(\mathcal{H}) = I_-[\mathcal{H}] - D_-(\mathcal{H})$ (10.13)


(We can define $W_+(\mathcal{H})$ correspondingly.) Then $P \in W_-(\mathcal{H})$ if and only if
P is both the past end-point of a future-inextendible timelike curve $\alpha$ not
meeting $\mathcal{H}$, and also the past end-point of a timelike curve $\beta$ with future
end-point on $\mathcal{H}$ (Fig. 39). Clearly $W_-(\mathcal{H})$ is open. Also, if Q < P and
$P \in W_-(\mathcal{H})$, then $Q \in W_-(\mathcal{H})$, since $\mathcal{H}$ is semispacelike. Hence,


$W_-(\mathcal{H}) = I_-[W_-(\mathcal{H})]$ (10.14)
Thus $\dot{W}_-(\mathcal{H})$ is an SSB. It is readily verified that
$H_-(\mathcal{H}) = \dot{W}_-(\mathcal{H}) \cap D_-(\mathcal{H})$ (10.15)


[use (9.5)], so again, our Cauchy horizon is obtained as a part of an SSB.


FIGURE 39. A semispacelike boundary containing the past Cauchy horizon
of $\mathcal{H}$.

Now suppose Xe W_(#), but X ¢ edge (.# ) [cf. (9.13), etc.] Either 
an x-curve or a f-curve with past end-point XY must exist (trivially— 
where we define a degenerate f-curve to exist whenever X € # ) but not both, 
since X¢W_(# ). If it is the x-curve that fails to exist, then Ye H_(H#); 
if it is the B-curve, then Xe /_[4%]—#. Suppose ¥ e H_(.%), soa f-curve 
(possibly degenerate) exists from X. Then we can choose a small open 
neighborhood 2 of X so that every point of 2 A J_[-#] is the end-point of a 
B-curve (that is, /_[#“]A2= 4002). For if Xe#, then, since X ¢ edge 
(#), there exist points A, B near X (with A < X < B)so that every timelike 
curve from A to B, in the neighborhood of X, meets #. In this case we 
choose 2 < /_(B)N/,(A). If X ¢.#, then choose 2 within the past of the 
f-curve from X. Any point U of 2q/_(X) isin W_(# ), so U is the past 
end-point of an x-curve, which must meet 2Q/_[#]in V, say. Now V is 
the past end-point of a f-curve also, so VEZNW_(H)CW_(#H) —2. 
Thus Lemma I condition (b) is satisfied, whence: 


EVERY POINT OF $H_-(\mathcal{H}) - \mathrm{edge}(\mathcal{H})$ IS THE PAST END-POINT
OF A NULL GEODESIC SEGMENT ON $H_-(\mathcal{H})$ (10.16)


Of course, the same applies also to $\dot{I}_-[\mathcal{H}] - \mathcal{H}$. Note that if $H_-(\mathcal{H})$ does
not meet edge($\mathcal{H}$), then every point of $H_-(\mathcal{H})$ is the past end-point of a
future-inextendible null geodesic on $H_-(\mathcal{H})$ (since extending the null geodesic
into the future, we can never reach a future end-point on $H_-(\mathcal{H})$). This
applies [cf. (9.13)] when $\mathcal{H}$ is an edgeless semispacelike closed set. The
results will clearly also apply in the time-reversed versions.

Another consequence of (10.16) is that if $\mathcal{H}$ is a semispacelike set which
intersects every maximal null geodesic of $\mathcal{M}$ in a nonvacuous compact set, then
$\mathcal{H}$ is a GCH for $\mathcal{M}$. For, if $\mathcal{H}$ were not a GCH, it would possess a
nonvacuous Cauchy horizon, a null geodesic generator of which would have to
meet $\mathcal{H}$ in a compact set. An end-point of this set would have to lie on
edge($\mathcal{H}$), by (10.16). However edge($\mathcal{H}$) must be vacuous. To see this,
consider $\dot{I}_-[\mathcal{H}]$. Now $\mathcal{H} \subset \dot{I}_-[\mathcal{H}]$ since $\mathcal{H}$ is semispacelike. Suppose
$P \in \dot{I}_-[\mathcal{H}]$ but $P \notin \mathcal{H}$. A null geodesic generator $\gamma$, of $\dot{I}_-[\mathcal{H}]$, passes
through P and contains a separate compact portion of $\mathcal{H}$. Let Q lie on $\gamma$
between P and this portion. Any null geodesic $\eta$ through Q, distinct from $\gamma$,
can meet $\dot{I}_-[\mathcal{H}]$ only at Q. Thus $\eta$ could not meet $\mathcal{H}$, so P does not exist.
We have $\mathcal{H} = \dot{I}_-[\mathcal{H}]$, so $\mathcal{H}$ is an SSB; $\mathcal{H}$ is therefore edgeless and the
result is established.

As a converse, we note that if $\mathcal{H}$ is any semispacelike set and
$P \in$ int $D_-(\mathcal{H})$ then any future-inextendible null geodesic $\gamma$ through P must
meet $\mathcal{H}$. This follows because a point $Q \in D_-(\mathcal{H})$ exists with Q < P, so if
$\gamma$ did not meet $\mathcal{H}$ we could follow just below $\gamma$ with a future-inextendible
timelike curve, contradicting $Q \in D_-(\mathcal{H})$. Similarly, we have the time
reverse of this result that any past-inextendible null geodesic through any
point of int $D_+(\mathcal{H})$ must also meet $\mathcal{H}$. This will be required in a moment.


PROOF OF THEOREM I. Set $\mathcal{S} = \dot{I}_+[\mathcal{T}]$ as in (10.11). Then if $P \in \mathcal{S} - \mathcal{T}$,
there will be a null geodesic k, a segment of which lies on $\mathcal{S}$ and has P as a
future end-point. Extend the geodesic maximally into the past. It hits $\mathcal{C}$
because $P \in$ int $D_+(\mathcal{C})$. Since $\mathcal{S}$ ($\subset I_+[\mathcal{C}]$) is closed, there is a point $R \in \mathcal{S}$
which is a past end-point of $k \cap \mathcal{S}$. This must lie on $\mathcal{S}_-$, so $R \in \mathcal{T}$.
Furthermore k must meet the spacelike 2-surface $\mathcal{T}$ orthogonally at R,
that is, lie on one of the two null hypersurfaces which intersect locally in $\mathcal{T}$,
since these represent the local boundary of $I_+[\mathcal{T}]$. Thus, $\mathcal{S}$ is generated
by the null geodesic segments k which are orthogonal to $\mathcal{T}$ at a past end-point,
and which may have a future end-point (where they encounter $\mathcal{S}_+$). In fact,
any such segment k must have a future end-point by Lemma II, because the
trapped surface condition (10.7), together with the focusing property
[cf. (7.39)] and null completeness (Id), imply that $\rho$ becomes unbounded at
some first point Q, on k, or on the extension of k into the future. (In fact,
the second possibility is the more usual one, corresponding to the occurrence
of a crossing region of $\mathcal{S}$, which generally occurs before the caustic is reached,
where $\rho = \infty$.) Now Q varies continuously as k varies [cf. (7.29), (7.33),
etc.] so the finite segments RQ generate a compact set. Thus $\mathcal{S}$, being
a closed subset of a compact set, is also compact. So, we have (9.7) that:


$\mathcal{S}$ IS A SEMISPACELIKE COMPACT TOPOLOGICAL 3-MANIFOLD IN $\mathcal{M}$ (10.17)


Now, it is a well-known theorem (see, for example, [101, p. 201]) that any
space-time admits a smooth timelike unit vector field. We can use the
integral curves of this vector field to achieve a one-to-one mapping of $\mathcal{S}$ into
$\mathcal{C}$ (both sets are semispacelike) since, by (Ia), $\mathcal{S} \subset D_+(\mathcal{C})$. This is
impossible since the dimensionalities are the same but $\mathcal{S}$ is compact and $\mathcal{C}$ is
noncompact. Thus (Ia), (Ib), (Ic), (Id) are mutually inconsistent and
Theorem I is established.
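The focusing step used in the proof above can be illustrated numerically. The sketch assumes that the focusing property takes the form dρ/dλ ≥ ρ² along each generator, with λ an affine parameter; this is an assumed reading of (7.39), which is not reproduced in this section. Under that assumption a generator starting with ρ₀ > 0 reaches unbounded ρ within affine distance 1/ρ₀. The function name, the constant `extra` term standing in for shear and matter contributions, and the cut-off `cap` are all hypothetical.

```python
def blowup_parameter(rho0: float, extra: float = 0.0,
                     step: float = 1e-4, cap: float = 1e6) -> float:
    """Integrate d(rho)/d(lam) = rho**2 + extra by RK4 until rho exceeds cap.

    Returns the affine parameter at which rho first exceeds `cap`, used
    here as a numerical stand-in for "rho becomes unbounded".
    """
    if rho0 <= 0.0 or extra < 0.0:
        raise ValueError("need rho0 > 0 and extra >= 0")

    def rate(rho: float) -> float:
        return rho * rho + extra

    rho, lam = rho0, 0.0
    while rho < cap:
        k1 = rate(rho)
        k2 = rate(rho + 0.5 * step * k1)
        k3 = rate(rho + 0.5 * step * k2)
        k4 = rate(rho + step * k3)
        rho += step * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        lam += step
    return lam


if __name__ == "__main__":
    for rho0 in (0.5, 1.0, 2.0):
        print(f"rho0 = {rho0}: blow-up near lam = {blowup_parameter(rho0):.3f}"
              f" (bound 1/rho0 = {1.0 / rho0:.3f})")
```

Any additional non-negative focusing term only brings the divergence earlier, which is why future-null completeness (Id) guarantees that the point Q is actually reached on each generator or its extension.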

In the face of Theorem I, we are led to ask which of the conditions is
most likely to break down in the actual universe. The most immediate
candidate is perhaps (Ia). There are two ways that (Ia) can fail to be
satisfied. In the first instance, the universe might be “closed” (that is to say,
spatially compact). Then we might envisage replacing $\mathcal{C}$ by a compact
spacelike hypersurface which is a GCH (or with $I_+[\mathcal{C}] = D_+(\mathcal{C})$). The proof
of Theorem I would then break down at the final stage. Indeed, the result
would no longer be true, because (for example) all the conditions would be
satisfied by the de Sitter space (9.14), (9.15), with $\mathcal{C}$ as the section of (9.14) by
V = −2a < 0 (a const.) and $\mathcal{T}$ as V = −a, W = 0. On the other hand, it
seems hard to believe that the question of whether the universe, as a whole, is
open or closed should seriously affect the discussion of a “local” collapsing
system (which would be smaller by a factor of at least $10^{11}$). Indeed, an
examination of the proof of Theorem I shows that the argument can still be
made to carry through in the case of a compact $\mathcal{C}$ unless, in a well-defined
sense, the collapsing object eventually “swallows up” the entire universe!
However, a closed universe (like the Friedmann model with k = 1) which
eventually recontracts to a highly condensed state could, in principle, satisfy
this condition. But the elimination of the singularity problem by this means
seems to be ruled out by a theorem of Hawking [41], which states, in effect,
that “almost all” space-times with a compact GCH must be incomplete if
they satisfy the slightly stronger (but still “reasonable”) condition on $R_{ab}$
implied by (7.47).

The second way in which (Ia) might be violated is a more serious
possibility. The “Laplacian” idea that the future of the universe should be
completely determined by its behavior at one “time” predates both relativity
and quantum theory. We have become accustomed to nondeterminism in
the latter theory, so why should we still demand determinism in all cases for
general relativity? The possibility that the universe should not possess a
GCH is an intriguing one. There are a number of physically interesting
space-times which do not possess GCH’s. Two examples were considered in
Section 9. Other examples which are more relevant to the question of
collapse (although possessing singularities) are the Kerr [48] and
Reissner-Nordström (see, for example, [62]) space-times, as extended, respectively, by
Boyer and Lindquist [12] (cf. also [18, 19]) and by Graves and Brill [37].
The Kerr metric is


$ds^2 = dv^2 - \frac{2mr}{h}(dv + a\sin^2\theta\,d\varphi)^2 - 2\,dv\,dr - h\,d\theta^2 - 2a\sin^2\theta\,dr\,d\varphi - (r^2 + a^2)\sin^2\theta\,d\varphi^2$ (10.18)

where m and a are constant (mass m, angular momentum ma) and where


$h = r^2 + a^2\cos^2\theta$ (10.19)


The Reissner-Nordström metric (with advanced time v and electric charge
defined by e) is

$ds^2 = \left(1 - \frac{2m}{r} + \frac{e^2}{r^2}\right) dv^2 - 2\,dv\,dr - r^2(d\theta^2 + \sin^2\theta\,d\varphi^2)$ (10.20)

The metric (10.18) with m > |a| and the metric (10.20) with m > |e| are
somewhat similar in that they possess not only an event horizon for external
geodesics (Schwarzschild “throat”), given, respectively, by


$r_+ = m + (m^2 - a^2)^{1/2}$, $r_+ = m + (m^2 - e^2)^{1/2}$ (10.21)
but also a second “horizon” at which the null cones tip back again:
$r_- = m - (m^2 - a^2)^{1/2}$, $r_- = m - (m^2 - e^2)^{1/2}$ (10.22)


Both metrics have true singularities at r = 0, but these differ somewhat in
nature. In the case of the Kerr metric the singularity has a ring structure.
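The horizon radii (10.21) and (10.22) are the two roots of r² − 2mr + a² = 0 (with e in place of a for the charged case), so they are easy to tabulate. The following is a small sketch with a hypothetical function name; it returns nothing when the parameter exceeds m, corresponding to the case noted earlier in which no event horizon exists.

```python
import math
from typing import Optional, Tuple


def horizon_radii(m: float, a: float) -> Optional[Tuple[float, float]]:
    """Return (r_minus, r_plus) = m -/+ sqrt(m^2 - a^2), or None if |a| > m.

    `a` may stand for the Kerr parameter a of (10.18) or the charge e of
    (10.20); geometrized units are assumed.
    """
    discriminant = m * m - a * a
    if discriminant < 0.0:
        return None  # no horizon in this case
    root = math.sqrt(discriminant)
    return m - root, m + root


if __name__ == "__main__":
    for a in (0.0, 0.6, 1.0, 1.5):
        print(f"m = 1, a = {a}: {horizon_radii(1.0, a)}")
```

For a = 0 the outer radius reduces to the Schwarzschild value 2m and the inner one to zero; as |a| approaches m the two radii merge.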


FIGURE 40. The Kerr picture (applicable also to the Reissner-Nordström
metric). [Figure labels: r = 0, $r = r_-$, $r = r_+$.]


By passing through the ring one enters a region with r<0. (Then, as 
Carter [19] has shown, closed timelike curves are encountered near the ring.) 
Ignoring the nature of the metrics near (and through) the singularities, the 
situation is as in Fig. 40. By patchwork similar to that in the Schwarzschild 
case, but here more complicated, a ‘‘ maximal” analytic extension is obtained 
(Fig. 41). The shaded portion in Fig. 41 represents the whole of Fig. 40. 
The entire Fig. 41 diagram can be pieced together (except for some exceptional 
central points) using overlapping patches isometric with the shaded region. 


FIGURE 41. The Carter-Boyer-Lindquist picture (incorporating that of
Graves-Brill). [Figure labels: region the “other side” of the ring in the Kerr
solution; singularity; star's boundary (solution below this line of no
relevance); observer travels to new portion of universe not causally
defined by the space-time at $\mathcal{H}$.]


(For this, one needs to be aware of a discrete symmetry of the metrics, which
is not manifest in the forms (10.18), (10.20), cf. [12].)

Now, the significance of all this is that by following the star inward and
then giving himself an acceleration outward after crossing $r = r_+$, an observer
can find himself in a portion of the universe not causally determined by
information on an apparently reasonable initial hypersurface $\mathcal{H}$. In fact,
he crosses $H_+(\mathcal{H})$. The region he enters actually contains a singularity
which he can “see.” But of course, since he has crossed $H_+(\mathcal{H})$, we
cannot guarantee that the space-time he enters is in fact the one that we have
constructed for him. In effect, “another universe” has joined on to the
universe he started from. The particular continuation we have chosen was
obtained here by an analytic continuation rather than by the use of field
equations. (But even analytic extensions do not ensure uniqueness globally.
A striking example occurs with the Taub-NUT space-time described in
Section 9, but also in flat space-time analytic extension does not imply
uniqueness. The difficulty is that one does not know, in general, when to
identify events reached in different ways.) Even if we can avoid the singularity
in the “new universe” there is a problem of principle here, because, as
in the early stages of a “big bang” cosmology, causally disconnected regions
have to come together in such a way that they “fit.” A striking example of
this occurs with the Reissner-Nordström solution, since the “new universe”
has to possess new charged particles of just the right total charge. (This
follows even independently of the symmetry.)

There is a further difficulty confronting our observer who tries to cross
H_+(ℋ). As he looks out at the universe that he is “leaving behind,” he
sees, in one final flash, as he crosses H_+(ℋ), the entire later history of the
rest of his “old universe.” It is here, in fact, that cosmological questions
might be very relevant to him. If, for example, an unlimited amount of
matter eventually falls into the star then presumably he will be confronted
with an infinite density of matter along “H_+(ℋ).” Even if only a finite
amount of matter falls in, it may not be possible, in generic situations, to avoid
a curvature singularity in place of H_+(ℋ). This is at present an open
question. But it may be that the place to look for curvature singularities is
in this region rather than (or as well as?) at the “center.” In this connection,
an example due to Bardeen [5] (cf. [23a, 66a]) of a collapsing charged dust
cloud is of interest. Here no singularity occurs within D_+(ℋ), although a
charge singularity has to be created in the “new universe” beyond H_+(ℋ)
(see Fig. 41). Whether any reality can be attributed to this situation in which
a “new universe” is created out of a collapse is an intriguing question. We
might view (10.17) as an indication that a trapped surface results in a spatially
compact universe “budding” off the original one. However, all this rests
on the supposition that H_+(ℋ) can survive a generic perturbation on ℋ,
without curvature singularities arising. We should also bear in mind that
in any case it seems unlikely that curvature singularities can be avoided, in
view of Theorem III of the next section.


11 SINGULARITIES IN COSMOLOGY 


This section is devoted to two important theorems due to Hawking
[40, 42]. The arguments will be given here partly in outline. For further
details, the reader is referred to Hawking’s original papers. The aim of the
theorems is to show that on the basis of present physical understanding, there
are (or were) very peculiar regions of space-time, probably involving
enormous curvatures, at which we would have to expect a local physical behavior
very different from that to which we have become accustomed. The interest
of Theorem II will lie, perhaps, mainly in the fact that no causality
assumptions whatever are made. It applies only to a spatially compact universe
satisfying a condition (an inequality) which might possibly apply to the actual
universe, but which would be virtually impossible to verify observationally.
Theorem III, on the other hand, contains an analogous condition which,
according to Hawking and Ellis [44], is likely to be satisfied on the basis even
of present-day astronomical observations. In time-reversed form, Theorem
III will also apply to suitable situations of gravitational collapse (such as a
mildly perturbed Oppenheimer-Snyder collapsing dust cloud; see Section 10).
A causality assumption is required for Theorem III, but of a much less
severe kind than the global Cauchy hypersurface condition (Ia) of Theorem I.
In effect, closed (that is, ≅ S^1) timelike curves are to be excluded, but we shall
require a slightly strengthened version of this.

Let P ∈ ℳ; then the following conditions on ℳ with respect to P are
equivalent [42, 50]:


ARBITRARILY SMALL NEIGHBORHOODS OF P EXIST WHICH EACH 
INTERSECT NO TIMELIKE CURVE IN A DISCONNECTED SET (11.1) 


that is, roughly speaking, timelike curves from the neighborhood of P cannot 
leave and then return to the neighborhood of P; 


IF P ≺ Q AND EVERY POINT OF I_-(Q) CHRONOLOGICALLY PRECEDES
EVERY POINT OF I_+(P), THEN P = Q        (11.2)


(cf. Section 9); or 
THE TIME-REVERSE OF (11.2) (11.3) 


If (11.1), (11.2), or (11.3) holds, then we say that strong causality holds at P. 
If the condition holds for all P ∈ ℳ, then we say strong causality holds for ℳ.




The equivalence of (11.1)-(11.3) is not hard to establish and will be omitted
here. An example, due to Hawking, for which the condition fails is indicated
in Fig. 42. (With Minkowski coordinates and metric, we omit (x^0 > 1,
x^1 < -1), (x^0 < -1, x^1 > 1), (x^0 = -x^1 = 1) and (x^0 = -x^1 = -1); then
identify (1, x^1, x^2, x^3) with (-1, -x^1, x^2, -x^3), x^1 < -1; P is the origin.)
If strong causality holds at P, we call a neighborhood 𝒬 of P causally convex
if no timelike curve meets 𝒬 in a disconnected set [cf. (11.1)].


[Figure 42: diagram; the edges marked with arrows are identified.]

FIGURE 42. A future- and past-distinguishing space-time for which strong
causality fails.


If strong causality fails at P, an arbitrarily small perturbation of the 
metric in the neighborhood of P could result in the actual presence of closed 
timelike curves, which for various reasons is highly undesirable physically.
Thus, it appears to be a very reasonable supposition that strong causality 
should actually hold at every point of space-time. Another condition, which 
is weaker than (11.1)-(11.3) but stronger than nonexistence of a closed time- 
like curve through P is future distinguishability [50] at P: 


        I_+(P) = I_+(Q) IMPLIES P = Q        (11.4)


This states that timelike curves through P into the future cannot leave and
then return arbitrarily close to P. For the time-reverse of this, we say ℳ is
past distinguishing at P if and only if for all Q


        I_-(P) = I_-(Q) IMPLIES P = Q        (11.5)


A hierarchy of causality conditions which are, on the other hand, stronger
than (11.1), have been suggested by Carter.³³ Again, violation of any of
Carter’s conditions could lead to closed timelike curves when ℳ is perturbed
slightly.

The condition that strong causality should hold at every point of ℳ
is an interesting one. It is, in fact, equivalent to the statement that the
“Alexandrov topology” 𝒯* is Hausdorff [50], where the open sets of 𝒯*
have as a base the sets I_+(X) ∩ I_-(Y). This is again equivalent to the
statement that 𝒯* agrees with the ordinary manifold topology of ℳ. We
can refer to Fig. 42 for a situation under which the condition fails. For this
example, any open set in 𝒯* containing P meets any open set in 𝒯*
containing Q, so the Hausdorff property fails.
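To make the basis sets of the Alexandrov topology concrete, here is a minimal sketch for the simplest case, two-dimensional Minkowski space-time with c = 1, where each set I_+(X) ∩ I_-(Y) is an open “diamond.” The function names and the (t, x) tuple representation are illustrative assumptions, not notation from the text.

```python
def chronologically_precedes(p, q):
    """True if q lies in I_+(p); events are (t, x) pairs in 2D Minkowski space."""
    dt = q[0] - p[0]
    dx = q[1] - p[1]
    # Strict inequalities: null-separated events are excluded, so the set is open.
    return dt > 0 and dt * dt > dx * dx

def in_alexandrov_set(z, x, y):
    """True if z lies in the basis set I_+(x) intersected with I_-(y)."""
    return chronologically_precedes(x, z) and chronologically_precedes(z, y)
```

Because the inequalities are strict, events on the null boundary of the diamond are excluded. In Minkowski space these open diamonds generate the ordinary topology, which is the situation in which strong causality holds everywhere.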

Let us now state the theorems.


Theorem II. (Hawking) The following requirements on a space-time ℳ are
mutually inconsistent:


(IIa) There exists a compact spacelike hypersurface (without
boundary) ℋ.

(IIb) The divergence θ of the unit normals to ℋ is positive at every
point of ℋ.

(IIc) R_{ab} t^a t^b ≤ 0 for each timelike vector t^a.

(IId) ℳ is geodesically complete in past timelike directions.


We can think of (IIa) and (IIb) as stating that “the universe is (or was)
spatially compact and expanding.” However, the “expansion” as implied
by (IIb) has to take place at every point of ℋ. Thus, in the presence of a
collapsed object, we could not expect to be able to arrange (IIb) for an ℋ
“at the present time.” If the condition is satisfied, therefore, ℋ would
probably have to refer to an early phase of the expansion.

On the assumption of Einstein’s equations (7.1), requirement (IIc) is
the “energy condition” (7.47), which (assuming, preferably, that λ = 0)
is very reasonable physically.


Theorem III. (Hawking) The following requirements on a space-time ℳ are
mutually inconsistent:


(IIIa) Strong causality holds at every point of J_-(P), for some point
P ∈ ℳ.

(IIIb) The divergence of all the timelike and null geodesics through P
changes sign somewhere to the past of P.

(IIIc) R_{ab} t^a t^b ≤ 0 for each timelike vector t^a.

(IIId) ℳ is geodesically complete in past timelike and null directions.


33 One such condition would be violated if there were two points P, R ∈ ℳ, such that
for any neighborhoods 𝒫 of P and ℛ of R, there would always be a timelike curve with past
end-point in 𝒫 and future end-point in ℛ, and another with future end-point in 𝒫 and past
end-point in ℛ. This behavior can occur without strong causality failing, as can be seen
by taking ℳ as the twofold covering manifold of that of Fig. 42 (Carter).


Condition (IIIb) is a little awkwardly stated because the divergence in
the case of the timelike geodesics is measured by the quantity θ of (7.40)
while the divergence of the null geodesics is measured by ρ of (7.18), (7.33).
The region at which ρ changes sign is not actually the limit of the region
where θ changes sign. Another way of stating (IIIb) would be to refer only
to the timelike geodesics and state that the (relevant) change of sign takes
place within a compact region.

I shall outline the proofs of Theorems II and III by indicating the main
lemmas involved. (These lemmas and their proofs are all essentially due to 
Hawking [40, 42], although the discussion given here deviates slightly from his.) 


Lemma IV. If a spacelike hypersurface ℋ (without boundary) exists, which
is a closed subset of ℳ (that is, properly embedded), then a covering
manifold ℳ* of ℳ exists where the inverse image of ℋ under the covering map
consists of a set of discrete isometric copies of ℋ, each of which is
semispacelike in ℳ*.


The idea is to make ℋ, in (IIa), into a semispacelike set, by “unwrapping”
ℳ, but not so much as might render the new ℋ noncompact. To achieve
this, choose a point O in ℳ, then for each P ∈ ℳ and for each equivalence
class of curves from O to P whose intersection number with ℋ is a given
integer we assign a point of ℳ*. (This intersection number is homotopy
invariant because ℋ has no edges.) One readily verifies that ℳ* has the
properties required of it in the lemma. [This construction effectively obtains
ℳ* from the universal covering manifold of ℳ, by putting an equivalence
relation on the elements of the fundamental group of ℳ, saying two such
elements are equivalent if they give equal intersection numbers with ℋ
(see also, Geroch [34]).]
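The unwrapping can be illustrated on a deliberately simple toy model, which is an assumption made here and not an example from the text: a cylinder whose time coordinate t is periodic with period 1, with the hypersurface taken to be the slice t = 0 (mod 1). A path is described by its starting value of t and a list of signed increments; the names below are hypothetical.

```python
import math

def intersection_number(t_start, increments):
    """Signed count of crossings of the slice t = 0 (mod 1) by a path.

    Only the unwrapped end value of t matters, so two paths with the
    same end-points and the same winding give the same integer.
    """
    t_end = t_start + sum(increments)
    return math.floor(t_end) - math.floor(t_start)

def lift(t_start, increments):
    """Point of the toy covering space: (position on the circle, sheet index)."""
    t_end = t_start + sum(increments)
    return (t_end - math.floor(t_end), intersection_number(t_start, increments))
```

In this toy case the covering space is the unwrapped strip, and each sheet index labels one copy of the slice. A path that winds once around the cylinder returns to the same position on the circle but lands on a different sheet, so the lifted slice can no longer be joined to itself by a timelike curve.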

Without loss of generality, we now regard ℳ* as ℳ in Theorem II,
so we can take ℋ to be a semispacelike set. In order that we can treat
Theorems II and III concurrently to a large extent, define, for Theorem III,


        ℋ = \dot{I}_-(P)        (11.6)


(Strong causality at P implies that ℋ is nonvacuous, resembling a null cone
at P.) Then, in both cases, we have ℋ = \bar{ℋ} as an edgeless semispacelike set.

It will be useful to have in the back of one’s mind a picture of some types
of situation which might possibly arise. For Theorem II, we envisage a
situation like that of Fig. 43. Neither H_-(ℋ) nor I_-[ℋ] can meet ℋ since
ℋ is edgeless and a spacelike hypersurface. Every point of H_-(ℋ) lies on
a future-inextendible null geodesic on H_-(ℋ) which may go to infinity or
(since causality assumptions are not made here) simply spiral round and round
in some compact region [cf. Fig. 33]. For Theorem III, an awkward type
of possible situation is illustrated in Fig. 44 for which H_-(ℋ) can intersect
ℋ. (It is the presence of such a possibility which requires the causality
assumption (IIIa) to be made.)


FIGURE 43. The type of situation to keep in mind for Theorem II.


Lemma V. If ℋ = \bar{ℋ} is edgeless and semispacelike, then int D_-(ℋ) consists
of all points X ∈ I_-[ℋ] for which the set J_+(X) ∩ I_-[ℋ] is both compact
and contains no points at which strong causality fails.


[Figure 44: diagram; the edges marked with arrows are identified.]

FIGURE 44. An awkward possibility to bear in mind for Theorem III.




Write 𝒬 for the set of such points X. [For J_+(X), see (9.2).] Let
X ∈ 𝒬 and consider a future-inextendible timelike curve with X as past
end-point. If this did not meet ℋ, it would, by the compactness of 𝒴 = J_+(X) ∩
I_-[ℋ], have to have an accumulation point in 𝒴. This is impossible since
strong causality holds in 𝒴. Thus, 𝒬 ⊂ D_-(ℋ). Furthermore X ∉ H_-(ℋ),
since otherwise, by (10.16), a future-inextendible null geodesic with past
end-point X would lie in 𝒴, ℋ being edgeless. (Such a null geodesic could
not be contained in a compact set unless strong causality failed in the set.)
Hence 𝒬 ⊂ int D_-(ℋ). For the converse, we recall, first, that if
Z ∈ int D_-(ℋ) then any future-inextendible nonspacelike curve through Z
must meet ℋ. Now strong causality cannot fail at any point Z ∈ int D_-(ℋ).
(Otherwise, in the critical case, we could use the “Q” of (11.2) to get ZQ
as a null geodesic along which strong causality fails and whose maximal
extension would not meet ℋ.) Suppose, on the other hand, that 𝒴_0 =
J_+(Z_0) ∩ I_-[ℋ] fails to be compact for some Z_0 ∈ int D_-(ℋ). Cover 𝒴_0
by a locally finite system of causally convex open sets 𝒰_i which are small
enough to ensure that any point U in 𝒰_i is the center of a normal coordinate
open ball ℬ_U containing 𝒰_i. Let Z_0 ∈ 𝒰_{i_0}. Since 𝒴_0 is noncompact it
contains a sequence of points W_i with no accumulation point in 𝒴_0. A
nonspacelike curve from Z_0 to W_i exists and meets the boundary of 𝒰_{i_0}
in V_i, say. Let V be an accumulation point of V_i. The nonspacelike
geodesic segment Z_0 V meets the boundary of 𝒰_{i_0} in a unique point
Z_1 ∈ 𝒰_{i_1}. Then 𝒴_1 = J_+(Z_1) ∩ I_-[ℋ] contains
a subsequence of the W_i without an accumulation point in 𝒴_1 and so also
fails to be compact. Repeating the argument we obtain a sequence
Z_0 ≺ Z_1 ≺ Z_2 ≺ ... which lie on a nonspacelike curve γ. From the locally
finite nature of the 𝒰_i system it follows that the Z_i have no accumulation
point, so γ is a future-inextendible nonspacelike curve through Z_0 not meeting
ℋ. Thus Z_0 ∉ int D_-(ℋ), showing that int D_-(ℋ) ⊂ 𝒬. This completes
the proof of Lemma V.

If 𝒜 and ℬ are subsets of ℳ or points of ℳ, define d(𝒜, ℬ) to be the
least upper bound of the lengths of all timelike curves with past end-point (in) 𝒜
and future end-point (in) ℬ. (Put d(𝒜, ℬ) = 0 if there are no such curves.)
In some cases we shall have d(𝒜, ℬ) = ∞ (but for bounded subsets of
Minkowski space-time, for example, it is always finite: a timelike straight
line connecting two Minkowskian points has the maximum length among all
timelike curves connecting the points).
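The remark about Minkowski space-time can be checked numerically. The sketch below, whose function name and (t, x) event representation are illustrative assumptions, computes the length of a piecewise-straight timelike curve in two-dimensional Minkowski space with c = 1.

```python
import math

def proper_length(events):
    """Length of a piecewise-straight, future-directed timelike curve.

    `events` is a list of (t, x) pairs; each straight segment contributes
    (dt^2 - dx^2)^(1/2).
    """
    total = 0.0
    for (t0, x0), (t1, x1) in zip(events, events[1:]):
        dt, dx = t1 - t0, x1 - x0
        interval = dt * dt - dx * dx
        if dt <= 0 or interval <= 0:
            raise ValueError("segment is not future-directed timelike")
        total += math.sqrt(interval)
    return total
```

For the end-points (0, 0) and (10, 0), the straight line has length 10, while the bent curve through (5, 3) has length 2 (25 - 9)^{1/2} = 8. Any detour shortens a timelike curve, which is why the least upper bound d is attained by the straight line in this case.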


Lemma VI. If ℋ = \bar{ℋ} is edgeless and semispacelike, then d(X, ℋ) is finite,
and attained for some geodesic γ from X to ℋ, whenever X ∈ int D_-(ℋ).


By Lemma V, we know that 𝒴 = J_+(X) ∩ I_-[ℋ] is compact, and can
therefore be covered by a finite number of small open normal coordinate
neighborhoods 𝒰_i. Since strong causality holds in int D_-(ℋ), we can
arrange the boundaries of the 𝒰_i’s to be causally convex. Write d_i for the
least upper bound of lengths of timelike curves in 𝒰_i. Then from local
arguments it follows d_i is finite, and attained (by a geodesic, in fact). If γ is
a timelike curve from X to ℋ then the length of γ cannot exceed d = Σ d_i.
Hence d(X, ℋ) is finite. Furthermore, from the compactness, it will
follow that the maximum of d(X, Y) for Y ∈ ℋ will be attained, say, for
Y = Y_0.

To construct γ of maximal length from X to ℋ [so that γ actually is
of length d(X, Y_0)], choose a normal coordinate ball ℬ_0 with center Y_0, and
choose Z = Z_0 to maximize d(X, Z) + d(Z, Y_0), where Z ∈ 𝒴 varies over the
compact boundary of ℬ_0. Define γ to be the geodesic with future end-point
Y_0 which contains Z_0 and extends into the past to a length d. Repeat the
above argument, applying it to Z_0 in place of Y_0, and maximize d(X, V) +
d(V, Z_0). We get V_0, which must lie on γ (since a timelike curve with a
corner can always be increased in length by smoothing the corner).
Continuing the process, we see that γ must terminate with past end-point at X,
establishing Lemma VI.


Lemma VII. If ℋ is a spacelike hypersurface and γ is a timelike geodesic
segment from X to ℋ which maximizes d(X, ℋ), then γ meets ℋ
orthogonally and contains no point (other than possibly X) which is conjugate to ℋ.


The argument is not essentially different from that for a manifold with
a positive definite metric (see Milnor [58]). A point conjugate to ℋ on γ
is a focal point for the congruence Γ of timelike geodesics which meet ℋ
orthogonally [that is, a point at which the θ of (7.40) becomes infinite].

We can also apply Lemma VII in a limiting case, to the hypersurface ℋ
of (11.6) required for Theorem III. In this case, if γ is a timelike geodesic
segment from X to ℋ which maximizes d(X, ℋ), then γ, to be “orthogonal”
to ℋ, would have to contain P. Every other point of ℋ lies on a null
geodesic on ℋ which extends into the future to points more distant from X.
In this case, then, we interpret Lemma VII as saying that γ passes through P
and contains no point (other than possibly X) which is conjugate to P (that is,
which is a focal point of the congruence Γ of timelike or null geodesics
through P).

We are now in a position to apply these results to Theorems II and III.
Let t^a denote the unit (future-pointing) tangent vector field to the timelike
γ’s of Γ (in each case). By (IIb) or (IIIb), we have θ > 0 at some point in
I_-[ℋ] on each γ ∈ Γ. Thus by (IIc) or (IIIc) and the Raychaudhuri effect
[cf. (7.40)-(7.47); also (7.39)], we have a focal point G (θ = ∞; -ρ = ∞)
somewhere on each γ ∈ Γ, to the past of ℋ. [This uses past-timelike
completeness (IId), (IIId) and past-null completeness in (IIId).] If there is more
than one such focal point, we choose the one nearest to ℋ (and ≠ P). Now
G will vary continuously with γ. By the compactness of ℋ (Theorem II)
or the fact (Theorem III) that the timelike and null directions at P constitute
a compact system, we obtain the fact that the region 𝒢 swept out by the
segments of γ from G to ℋ will be compact.
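The focusing step can be illustrated with a small numerical sketch. It rests on assumptions that are not taken from the text: the expansion is normalized so that, for an irrotational timelike congruence, dθ/dτ = -θ²/3 - f(τ), where f ≥ 0 collects the shear term and the curvature term made non-negative by the energy condition. The sign and normalization of the θ of (7.40) may differ. Under these assumptions, u = 1/θ obeys du/dτ = 1/3 + f u², so an initially converging congruence (θ_0 < 0) focuses within parameter distance 3/|θ_0|. All names are hypothetical.

```python
def focal_bound(theta0):
    """Upper bound 3/|theta0| on the parameter distance to the focal point."""
    return 3.0 / abs(theta0)

def focal_parameter(theta0, focusing=lambda tau: 0.0, step=1e-4):
    """Integrate u = 1/theta forward until u reaches 0 (theta diverges).

    Uses forward Euler on du/dtau = 1/3 + focusing(tau) * u^2, where
    `focusing` must be non-negative (an assumption of this sketch).
    """
    if theta0 >= 0:
        raise ValueError("need an initially converging congruence (theta0 < 0)")
    u, tau = 1.0 / theta0, 0.0
    while u < 0.0:
        u += step * (1.0 / 3.0 + focusing(tau) * u * u)
        tau += step
    return tau
```

With no extra focusing the bound is attained, and any positive focusing term brings the focal point closer. Geodesic completeness is what guarantees that the geodesic can actually be continued as far as this focal point.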

Now, by Lemmas VI and VII we have the result that int D_-(ℋ) ⊂ 𝒢,
and thus D_-(ℋ) ⊂ 𝒢 ∪ ℋ. In particular, H_-(ℋ) ⊂ 𝒢 ∪ ℋ. In
Theorem II, ℋ is spacelike, so H_-(ℋ) ∩ ℋ = ∅. The Cauchy horizon
H_-(ℋ), being a closed subset of the compact set 𝒢, is now compact. By
(10.15) and (9.7), H_-(ℋ) is thus a compact C^0 manifold without
boundary. By (IId), we can extend each γ into the past to length greater than
max d(G, ℋ). Then each γ meets H_-(ℋ) by Lemmas VI and VII. Put
F = γ ∩ H_-(ℋ) and define p(F) to be the maximum of lengths of the
segments of γ from F to ℋ. (Attained, for fixed F, because of the
compactness of ℋ and the Γ system.) It is not hard to see that p(F) [with
F ∈ H_-(ℋ)] attains its minimum value. Let F = F_0 minimize p(F). Now,
a null geodesic η on H_-(ℋ) has F_0 as past end-point. Let F_1 be just to the
future of F_0 on η. The length of the jointed curve from F_0 to ℋ, which consists
of the small segment F_0 F_1 of η together with a maximal γ curve from F_1 to ℋ,
cannot be less than p(F_0). Such γ exists since D_-(ℋ) ⊂ 𝒢. Smoothing
the corner, and moving the curve just away from H_-(ℋ), we get a timelike
curve ζ from F_0 to ℋ of length k greater than p(F_1). Taking B on ζ very
close to F_0, so d(B, ℋ) > p(F_0), and letting B approach F_0, we get a limiting
position of the (relevant) γ through B (from the compactness of ℋ). This
would give us p(F_0) ≥ k > p(F_1), which is the required contradiction
establishing Theorem II.

In Theorem III, ℋ is not spacelike, so H_-(ℋ) might intersect ℋ
(Fig. 44). Thus, we cannot assume that H_-(ℋ) is compact and the above
argument will break down. Instead, we bring in the strong causality. Define
a relation < on H_-(ℋ) by


        U < V IF AND ONLY IF I_-(U) ⊂ I_-(V)        (11.7)


[that is, U ∈ \overline{I_-(V)}]. Then < is an order relation on H_-(ℋ) (weaker than
≺). Put 𝒦 = 𝒢 ∩ H_-(ℋ). Then 𝒦 is compact, H_-(ℋ) being closed and
𝒢 compact. Set ℒ = 𝒦 ∩ \dot{𝒢} ⊂ H_-(ℋ) − int 𝒢 ⊂ ℋ. Let U ∈ 𝒦. Then
a future-inextendible null geodesic α on H_-(ℋ) has past end-point U. By
compactness and strong causality, α must leave 𝒦 at U_0 ∈ ℒ, say, and we
shall have U < U_0. But in the neighborhood of α − 𝒦 are timelike curves
leading to P [since α − 𝒦 ⊂ ℋ = \dot{I}_-(P)]. These timelike curves must meet
H_-(ℋ) − ℋ ⊂ 𝒦 on their way to P. In the limit as such curves approach α,
this intersection point has an accumulation point V on 𝒦 and we readily
verify U_0 < V (with U_0 ≠ V because of strong causality). Similarly V gives
rise to V_0 on ℒ with V < V_0, and to W with V_0 < W (V_0 ≠ W), etc. The
sequence of distinct points U_0 < V_0 < W_0 < ... on ℒ must have an
accumulation point, by the compactness of ℒ. We would have to have a
breakdown of strong causality at this point. This establishes Theorem III.

One of the significant features of Theorem III is that it is not necessary 
to make assumptions apart from strong causality as to the nature of the 
universe as a whole. The theorem refers only to the past null cone of P and 
its interior, these being regions in principle observable by an observer at P. 
In time-reversed form the theorem applies to sufficiently uniform collapse 
situations (although not quite of the generality of those treated in Theorem I) 
and requires no information of a global cosmological nature. (We choose P 
at the center and assume the collapse is similar to a late stage of the 
Oppenheimer-Snyder situation.) Also, since no Cauchy hypersurface 
condition is needed, one is led to believe that even if the Cauchy horizon of 
Fig. 41 still exists in generic situations, this would probably not enable us to 
avoid singularities elsewhere. 

Of course, when applied to the past cone of P, we would not expect the 
divergence of the geodesics to change sign until a substantial portion of the 
entire universe has been encompassed. To estimate where this point should 
occur, one would need to know the density of material in the universe and 
knowledge of this is very uncertain at present. Also, much of the matter, 
for example, galaxies, is very irregularly distributed, so the total effect might 
be somewhat complicated. It ceases to be clear whether (IIIb) would be 
satisfied at all, by virtue merely of observed material objects. The idea of 
Hawking and Ellis [44], however, is to use the radiation density of the universe, 
whose present existence is in the form of a highly isotropic distribution with a 
spectrum apparently that of a blackbody at 3° absolute. They argue that 
this radiation density, owing to its uniformity, ought in itself to be sufficient
ultimately to ensure that (IIIb) is satisfied, except in the unlikely possibility 
of there being a large amount of ionized gas present. If this gas is present, 
then it should be sufficient to produce the desired focusing instead. 
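To give a feeling for the magnitude involved, the following sketch evaluates the energy density of blackbody radiation at 3 degrees absolute, using the standard relation u = a T^4 and converting to an equivalent mass density u/c^2. It is only an order-of-magnitude illustration and does not reproduce the Hawking-Ellis argument; the function names are hypothetical and the constants are standard SI values supplied here, not taken from the text.

```python
RADIATION_CONSTANT = 7.5657e-16   # a = 4*sigma/c, in J m^-3 K^-4
SPEED_OF_LIGHT = 2.99792458e8     # in m/s

def blackbody_energy_density(temperature_kelvin):
    """Energy density u = a * T^4 of blackbody radiation, in J/m^3."""
    return RADIATION_CONSTANT * temperature_kelvin ** 4

def equivalent_mass_density(temperature_kelvin):
    """Mass density u / c^2 equivalent to the radiation energy, in kg/m^3."""
    return blackbody_energy_density(temperature_kelvin) / SPEED_OF_LIGHT ** 2
```

At 3 K this gives an energy density of roughly 6 x 10^-14 J/m^3, equivalent to a mass density of roughly 7 x 10^-31 kg/m^3. The density is small, but it is spread almost uniformly, which is the feature the argument relies on.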

The indications are, then, that if Einstein’s equations hold, then 
(curvature?) singularities³⁴ are a real feature of our universe. (Actually,
Einstein’s equations are not very strongly used. For example, if we were to
replace Einstein’s theory by the Brans-Dicke [13] theory, then the
conclusions of Theorems I, II, and III would be essentially unaffected.) More
results are needed, however, if we are to get any feeling for the structure of 
these peculiar (presumably highly curved) regions of space-time. And really 
to ascertain the kind of physics that might be involved in such a region, we 
would need a much deeper understanding of the interrelation between matter, 
quantum theory, and space-time structure than we have at present. 


34 By a “singularity” I mean a peculiar region, in which the local physics would be
drastically affected, possibly to the extent that the smooth manifold picture of space-time 
would no longer be adequate (cf. discussion in Section 1). 




Under normal circumstances, general relativity can, for practical 
purposes, remain remarkably apart—almost aloof—from the rest of physics. 
At a space-time singularity, the very reverse must surely be the case! 


REFERENCES 


1. Y. Aharonov and L. Susskind, Phys. Rev. 158, 1237 (1967). 
2. R. A. Alpher and R. C. Herman, Nature 162, 774 (1948). 
3. A. Avez, Compt. Rend. 254, 3984 (1962). 

4. A. Avez, Inst. Fourier 105 (1963). 

5. J. Bardeen, to be published. 

6. C. Behr, Zeit fur Astrophysik 60, 286 (1965). 

6a. P. G. Bergmann, Phys. Rev. Lett. 12, 139 (1964). 

7. B. Bertotti, Proc. Roy. Soc. (Lond.) A294, 195 (1966). 

8. H. Bondi, Nature, 186, 535 (1960). 


9. H. Bondi, in Lectures on General Relativity: 1964 Brandeis Summer 
Institute in Theoretical Physics, Vol. I (Prentice-Hall, Englewood Cliffs, 1965). 


10. H. Bondi, F. A. E. Pirani, and I. Robinson, Proc. Roy. Soc. (Lond.) A251, 
519 (1959). 

11. H. Bondi, M. G. J. van der Burg, and A. W. K. Metzner, Proc. Roy. Soc. 
(Lond.) A269, 21 (1962). 

12. R. H. Boyer and R. W. Lindquist, J. Math. Phys. 8, 265 (1967). 

13. C. Brans and R. H. Dicke, Phys. Rev. 124, 925 (1961). 

14. D. Brill, Annals of Phys. 7, 466 (1959). 

14a. D. Brill and S. Deser, Phys. Rev. Lett. 20, 75 (1968). 

15. H. W. Brinkmann, Nat. Acad. Sci. (U.S.) 9, 1 (1923). 

16. H. A. Buchdahl, Nuovo Cim. 10, 96 (1958); 25, 486 (1962). 

17. E. Calabi and L. Markus, Ann. Math. 75, 63 (1962). 

18. B. Carter, Phys. Rev. 141, 1242 (1966). 


19. B. Carter, Stationary Axi-symmetric Systems in General Relativity (Ph.D. 
Dissertation, Cambridge University, 1967). 


20. E. Cartan, Ann. Ec. Norm. Sup. 40, 325 (1923); 41, 1 (1924). 
21. S. Chandrasekhar, M. N. 95, 207 (1935). 

21a. S. Chandrasekhar, Proc. Nat. Acad. Sci. 57, (1967). 

22. G. F. Chew, Sci. Prog. 51, 529 (1963). 

23. R. Debever, Comptes Rend. 249, 1324, 1744 (1959). 

23a. V. de la Cruz and W. Israel, Nuovo Cim. 51A, 745, (1967). 
24. R. H. Dicke, Phys. Rev. 125, 2163 (1962). 


25. R. H. Dicke, P. J. E. Peebles, P. G. Roll, and D. T. Wilkinson, Ap. J. 142, 
414 (1965). 


26. A. G. Doroshkevich, Ya. B. Zel’dovich and I. D. Novikov, Zhur. Eksp.
Teor. Fiz. 49, 170 (1965); Engl. tr.: Sov. Phys. JETP 22, 122 (1966).

27. A. S. Eddington, Nature 113, 192 (1924).

28. J. Ehlers and W. Kundt, in Gravitation (ed. L. Witten; John Wiley and
Sons, New York, 1962).

29. A. Einstein, Ann. Physik. 94, 769 (1916).

30. A. Einstein and J. Grommer, S. B. Preuss. Akad. Wiss. 2, (1927).

31. A. Einstein, L. Infeld, and B. Hoffmann, Ann. Math. 39, 65 (1938).

32. D. Finkelstein, Phys. Rev. 110, 965 (1958).

33. G. Gamow, Nature 162, 680 (1948).

34. R. P. Geroch, J. Math. Phys. 8, (1967).

35. R. P. Geroch, Singularities in the Spacetime of General Relativity (Ph.D.
Dissertation, Princeton University, 1967).

36. J. N. Goldberg, in Gravitation (ed. L. Witten; John Wiley and Sons,
New York, 1962).

37. J. C. Graves and D. R. Brill, Phys. Rev. 120, 1507 (1960).

38. Haefliger; see Topics on Space-Times by A. Lichnerowicz, Ch. V, this
volume.

39. B. K. Harrison, K. S. Thorne, M. Wakano, and J. A. Wheeler,
Gravitation Theory and Gravitational Collapse (University of Chicago Press, Chicago, 1965).

40. S. W. Hawking, Proc. Roy. Soc. (Lond.) A294, 511 (1966).

41. S. W. Hawking, Proc. Roy. Soc. (Lond.) A295, 490 (1966).

42. S. W. Hawking, Proc. Roy. Soc. (Lond.) A300, 187 (1967).

43. S. W. Hawking and G. F. R. Ellis, Physics Letters 17, 246 (1965).

44. S. W. Hawking and G. F. R. Ellis, Ap. J. in press (1968).

45. S. Helgason, Differential Geometry and Symmetric Spaces (Academic
Press, New York, 1962).

46. L. Infeld and B. L. van der Waerden, S. B. Preuss Akad. 9, 380 (1933).

47. P. Jordan, J. Ehlers, and R. K. Sachs, Akad. Wiss., Mainz 1 (1961).

47a. R. Kantowski and R. K. Sachs, J. Math. Phys. 7, 443 (1966).

48. R. P. Kerr, Phys. Rev. Lett. 11, 237 (1963).

49. A. Komar, Phys. Rev. 104, 544 (1956).

50. E. Kronheimer and R. Penrose, Proc. Camb. Phil. Soc. 63, 481 (1967).

50a. M. D. Kruskal, Phys. Rev. 119, 1943 (1960).

51. N. H. Kuiper, Ann. of Math. 50, 916 (1949).

52. W. Kundt, Z. Phys. 172, 488 (1963).

53. E. M. Lifshitz and I. M. Khalatnikov, Advan. in Phys. 12, 185 (1963).

54. R. W. Lindquist and J. A. Wheeler, Revs. Mod. Phys. 29, 432 (1957).

54a. S. MacLane and G. Birkhoff, Algebra (Macmillan, New York, 1967).

55. L. Markus, Ann. Math. 62, 411 (1955).

56. R. F. Marzke and J. A. Wheeler, in Gravitation and Relativity (ed.
H. Y. Chiu and W. F. Hoffmann; Benjamin, New York, 1964).


57. J. W. Milnor, Enseignement Math. (2) 9, 198 (1963). 

58. J. Milnor, Morse Theory (Annals of Math. Studies, Princeton University 
Press, Princeton, 1963). 

59. C. W. Misner, J. Math. Phys. 4, 924 (1963). 

60. C. W. Misner, Ap. J. in press (1968). 

61. C. W. Misner and A. H. Taub, to be published. 

62. C. Møller, The Theory of Relativity (Oxford Univ. Press, Oxford, 1952).

63. E. T. Newman, L. Tamburino, and T. Unti, J. Math Phys. 4, 915 (1963). 

64. E. T. Newman and R. Penrose, J. Math. Phys. 3, 566 (1962); 4, 998 (1963). 

65. K. Nomizu, Lie Groups and Differential Geometry (Herald Printing Co. 
Ltd., Tokyo, 1956). 

66. G. Nordström, Ann. der Physik 42, 533 (1913).

66a. I. V. Novikov, JETP Lett. 3, 142 (1966).

67. J. R. Oppenheimer and G. Volkoff, Phys. Rev. 55, 374 (1939). 

68. J. R. Oppenheimer and H. Snyder, Phys. Rev. 56, 455 (1939). 

69. J. P. Ostriker, P. Bodenheimer and D. Lynden-Bell, Phys. Rev. Lett. 17, 
816 (1966). 

70. W. T. Payne, Am. J. Phys. 20, 253 (1952). 

71. R. Penrose, Proc. Camb. Phil. Soc. 55, 137 (1959). 

72. R. Penrose, Ann. of Phys. 10, 171 (1960). 


73. R. Penrose, in P. G. Bergmann’s Aeronautical Research Lab. Tech. 
Documentary Rept. 63-56: Quantization of Generally Covariant Fields (Office of 
Aerospace Research, U.S. Air Force, 1963). 


74. R. Penrose, in Relativity, Groups, and Topology (ed. C. M. DeWitt and 
B. S. DeWitt; Gordon and Breach, New York, 1964). 


75. R. Penrose, Phys. Rev. Letters 14, 57 (1965). 
76. R. Penrose, Proc. Roy. Soc. (Lond.) A284, 159 (1965). 
77. R. Penrose, Rev. Mod. Phys. 37, 215 (1965). 


78. R. Penrose, in Perspectives in Geometry and Relativity (Indiana University 
Press, Bloomington, 1966). 


79. R. Penrose, J. Math. Phys. 8, 345 (1967). 

80. R. Penrose, Intern. J. Theor. Phys. 1, (1968). 

80a. A. A. Penzias and R. W. Wilson, Ap. J. 142, 419 (1965). 

80b. A. Z. Petrov, Sci. Not., Kazan State University 114, 55 (1954). 
81. F. A. E. Pirani, Phys. Rev., 105, 1089 (1957). 


82. F. A. E. Pirani, in Lectures on General Relativity: 1964 Brandeis Summer 
Institute in Theoretical Physics Vol. I (Prentice-Hall, Englewood Cliffs, 1965). 


83. F. A. E. Pirani and A. Schild, in Perspectives in Geometry and Relativity 
(Indiana University Press, Bloomington, 1966). 


84. J. Plebanski, Acta Phys. Polon. 27, 361 (1965). 
85. R. V. Pound and G. A. Rebka, Phys. Rev. Letters 4, 337 (1960). 
86. A. K. Raychaudhuri, Phys. Rev. 98, 1123 (1955). 




87. S. Refsdal, Mon. Not. R. Astr. Soc, 128, 295 (1964); 128, 307 (1964). 
88. W. Rindler, Mon. Not. Roy. Astr. Soc. 116, 6 (1956). 
89. I. Robinson, J. Math. Phys. 2, 290 (1961). 


90. I. Robinson, A. Schild, and E. L. Schucking, eds.: Quasi-Stellar Sources 
and Gravitational Collapse (University of Chicago Press, Chicago, 1965). 


91. H. S. Ruse, Proc. Lond. Math. Soc. 50, 75 (1948). 


92. H. Rudberg, The Compactification of a Lorentz Space (Thesis, University 
of Uppsala, Uppsala, Sweden, 1958). 


93. R. K. Sachs, Proc. Roy. Soc. (Lond.) A264, 309 (1961). 


94. R. K. Sachs, in Recent Developments in General Relativity (PWN/ 
Pergamon, Warsaw/New York, 1962). 


95. R. K. Sachs, Proc. Roy. Soc. (Lond.) A270, 103 (1962). 
96. A. Schild, Am. J. Phys. 28, 778 (1960). 
96a. J. A. Schouten, Ricci-Calculus (Springer, Berlin, 1954). 


97. E. Schrödinger, Expanding Universes (Cambridge Univ. Press, Cambridge, 
1956). 


98. H. J. Seifert, Z. Naturforsch. 22a, 1356 (1967). 
99. L. C. Shepley, Proc. Nat. Acad. Sci. 52, 1403 (1964). 
100. B. Spain, Tensor Calculus (Oliver and Boyd, London, 1960). 


101. N. Steenrod, The Topology of Fibre Bundles (Princeton Univ. Press, 
Princeton, 1951). 


102. K. Stellmacher, Math. Annalen 123, 34 (1951). 


103. J. L. Synge, Relativity: the General Theory (North-Holland, Amsterdam, 
1960). 


104. L. A. Tamburino and J. Winicour, Phys. Rev. 150, 1039 (1966). 
105. A. H. Taub, Ann. Math. 53, 472 (1951). 
106. J. Terrell, Phys. Rev. 116, 1041 (1959). 


107. K. S. Thorne, Relativistic Stellar Structure and Dynamics in 1966, 
Les Houches Lectures (ed. C. M. DeWitt, Gordon and Breach, New York, 1967). 


108. A. Trautman, in Lectures on General Relativity: 1964 Brandeis Summer 
Institute in Theoretical Physics, Vol. I (Prentice-Hall, Englewood Cliffs, 1965). 


109. A. Trautman, in Perspectives in Geometry and Relativity (Indiana
University Press, Bloomington, 1966). 


110. B. L. van der Waerden, Nachr. Ges. Wiss. Göttingen 100 (1929). 
111. H. Weyl, Nachr. Ges. Wiss. Göttingen 99 (1921). 
112. J. A. Wheeler, Geometrodynamics (Academic Press, New York, 1962). 


113. J. A. Wheeler, in Relativity, Groups and Topology (ed. C. M. DeWitt and 
B. S. DeWitt; Gordon and Breach, New York, 1964). 


113a. H. Whitney, Bull. Amer. Math. Soc. 43, 785 (1937). 
114. E. T. Whittaker, Proc. Roy. Soc. (Lond.) A158, 38 (1937). 
115. L. Witten, Phys. Rev. 113, 357 (1959). 


VIII 


The Structure of Singularities¹ 


ROBERT GEROCH 


From recent work of Hawking and others [1] it is known that singulari- 
ties occur in large classes of solutions of Einstein’s equations. Our intuitive 
idea of what a singularity should be in general relativity comes from the 
comparatively well-understood infinities which arise in other classical field 
theories, for example, in electrodynamics and hydrodynamics. However, 
general relativity differs from these theories in an important respect: whereas 
in the other field theories one has a background (Minkowskian) metric to 
which the field can be referred, in general relativity the “background metric” 
is the very field whose singularities we wish to describe. Consequently, it is 
not immediately clear what properties a space-time² should have in order that 
it be considered “singular.” 

Consider electrodynamics. A solution of Maxwell’s equations is said 
to have a singularity if the electromagnetic field cannot be defined at some 
points. (For example, in the static Coulomb solution the field cannot be 
defined at the spatial origin). However, it turns out to be quite difficult to 
construct an analogous definition for general relativity because one can 
always remove such undesirable points from the manifold. The resulting 
space-time would then appear to be nonsingular. To define a singularity in 
general relativity, we begin by assuming that all of the “real singularities” 
have been removed from the space-time manifold, that is, that the metric is 
defined and differentiable everywhere. The singularities are then expected 
to show up as geodesic incompleteness of the space-time [2]. That is, we 
define a space-time as singular if it is geodesically incomplete.³ 

The use of geodesic incompleteness as the defining characteristic of 
singularities leads, however, to a further problem. It is convenient (from 
experience with other theories) to think of singularities in terms of points, 


¹ A more complete discussion of this work appears in J. Math. Phys., February 1968. 

² By a “space-time” we understand a 4-manifold carrying a $C^\infty$ metric of Lorentz 
(−, +, +, +) signature. 

³ One often restricts consideration to only the timelike or only the null geodesics. 




that is, to ask questions such as: “Does the mass density become infinite in 
the vicinity of the singularity?” This very question presupposes that there 
are some “singular points” at which the singularity resides, and that we may 
ask questions about the neighborhoods of these points. But geodesic 
incompleteness does not at all provide us with the “singular points” or their 
“neighborhoods” that we should like to have in order to formulate physical 
questions. We shall outline an approach to bridging the gap between 
geodesic incompleteness on the one hand and the more physical notion of a 
singularity as a point on the other. We shall describe a construction by 
which, given any incomplete space-time, one may define the requisite singular 
points and describe their local properties.⁴ Our problem is thus primarily 
one of formulating definitions and, where necessary, of choosing from among 
alternative definitions. 

Were the metric positive-definite, we could consider the completion of 
the space-time as a metric space; that is, the “singular points” would be 
equivalence classes of Cauchy sequences. With our indefinite metric, this 
approach fails: instead, we consider equivalence classes of incomplete 
geodesics. Let $M$ be the space-time, $TM_P$ the tangent space at the point $P$, 
and $TM$ the tangent bundle. Define: 


$$H_1 = \exp^{-1} M$$
$$H = \bigcup_{\substack{P \in M \\ \xi \in TM_P}} \overline{\left(TM_{P,\xi} \cap H_1\right)}$$
$$H_0 = H - H_1,$$


where exp is the exponential map, $TM_{P,\xi}$ is the one-dimensional subspace of 
$TM_P$ spanned by the vector $\xi$, and a bar denotes closure in $TM_{P,\xi}$. Each 
point of $H_0$ generates an incomplete geodesic in $M$. In particular, $H_1 = TM$ 
if and only if $M$ is geodesically complete. 

Our plan is to form the singularity from equivalence classes of points of 
$H_0$. Each of $H$, $H_1$, and $H_0$ is a subset of $TM$, and therefore has a topology 
induced on it from $TM$. We define a new topology on $H_0$ as follows. Let 
$O$ be any open subset of $M$. Define: 


$$S(O) = \{\alpha \in H_0 \mid \text{there exists an open neighborhood } U \text{ of } \alpha \text{ in } H \text{ such that } \exp(U \cap H_1) \subset O\}$$


where “open neighborhood” refers to the induced topology on $H$. The 
subsets $S(O)$ of $H_0$, where $O$ ranges over all open sets in $M$, form a basis for 
a topology on $H_0$. If $\alpha, \beta \in H_0$, write $\alpha \approx \beta$ if every open neighborhood of 
$\alpha$ contains $\beta$ and every open neighborhood of $\beta$ contains $\alpha$ (“open neighborhood” 
in the topology we have just defined on $H_0$). The relation $\approx$ is an 




⁴ A similar construction has been described by S. W. Hawking (Essay Submitted 
for the Adams Prize, December, 1966). 




equivalence relation. Let $\partial$ denote the equivalence classes, and $\pi$ the mapping 
$H_0 \to \partial$. The set $\partial$ (with the quotient topology) is a $T_0$ topological space 
which we call the g-boundary. The points of $\partial$ are to represent the “singular 
points” of the space-time. 

We next attach the g-boundary to the space-time $M$, thus allowing us 
to express the idea that a point of $M$ is “close” to the singularity. Set 


$$\bar{M} = M \cup \partial,$$


the disjoint union. We define a topology on $\bar{M}$ as follows: A subset 
$(O, U) \subset \bar{M}$, where $O \subset M$, $U \subset \partial$, is open in $\bar{M}$ if $O$ is open in $M$, $U$ is open 
in $\partial$, and $S(O) \supset U$. $\bar{M}$ will be called the space-time with g-boundary. 

We may check with an example that our definitions of $\partial$ and of $\bar{M}$ are 
reasonable. Let $V$ be a geodesically complete space-time, and let $S$ be a 
3-submanifold of $V$ which divides $V$ into two disjoint parts, $M$ and $M'$. The 
incomplete geodesics of $M$ are precisely those geodesics which, when viewed 
in $V$, have an endpoint on $S$. We now ask for the g-boundary of $M$. It is 
easy to show that two incomplete geodesics in $M$ are in the same equivalence 
class if and only if they have the same endpoint on $S$. Thus, in this case, the 
g-boundary of $M$ is just the ordinary boundary, $S$, of $M$ in $V$. The space-time 
with g-boundary is homeomorphic to the union $M \cup S$. 
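
The example above can be made concrete in the simplest possible setting. The sketch below is an illustration only, not part of the original construction: it assumes $V$ is flat 2-dimensional Minkowski space with coordinates (t, x), takes $S$ to be the line x = 0 and $M$ the half-space x > 0, and uses the fact that geodesics of flat space are straight lines. The function names `endpoint_on_S` and `same_boundary_point` are hypothetical helpers introduced here.

```python
from fractions import Fraction


def endpoint_on_S(point, tangent):
    """Endpoint on S = {x = 0} of the geodesic of M = {x > 0} that starts
    at `point` = (t0, x0) with initial tangent `tangent` = (a, b).

    Assumption: V is flat, so the geodesic is the straight line
    (t0 + a*lam, x0 + b*lam). Only the forward direction (lam > 0) is
    followed. Returns None if the geodesic never reaches S that way.
    """
    t0, x0 = point
    a, b = tangent
    if x0 <= 0:
        raise ValueError("the starting point must lie in M (x > 0)")
    if b >= 0:
        return None  # x never decreases, so S is not reached going forward
    lam = Fraction(-x0) / Fraction(b)  # affine parameter at which x = 0
    return (Fraction(t0) + Fraction(a) * lam, Fraction(0))


def same_boundary_point(geodesic_1, geodesic_2):
    """Two incomplete geodesics of M define the same point of the
    g-boundary exactly when they share an endpoint on S."""
    e1 = endpoint_on_S(*geodesic_1)
    e2 = endpoint_on_S(*geodesic_2)
    return e1 is not None and e1 == e2


# Two different geodesics that both run into the point (t, x) = (2, 0) of S.
g1 = ((0, 2), (1, -1))
g2 = ((1, 1), (1, -1))
# A geodesic that runs into a different point of S, namely (0, 0).
g3 = ((0, 1), (0, -1))
```

Here `same_boundary_point(g1, g2)` holds while `same_boundary_point(g1, g3)` does not, which is the statement that the equivalence classes are labeled by points of $S$. Exact rational arithmetic is used so that the comparison of endpoints is not affected by rounding.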

In the above example, the boundary surface $S$ has a differentiable and a 
metric structure. It might be asked whether or not these two structures can 
be determined by examining only $M$. More generally, even when a space-time 
$M$ cannot be extended it may be possible under certain conditions to 
define a differentiable and metric structure on its g-boundary. Such definitions 
can in fact be constructed. We use the natural differentiable and metric 
structure of $TM$ to induce these structures on $H_0 \subset TM$, and then (by means 
of the map $\pi: H_0 \to \partial$) onto $\partial$.⁵ In the example considered earlier (in which 
$M$ has a regular boundary surface $S$) the differentiable and metric structure 
of the g-boundary of $M$ is just the induced differentiable structure and 
metric on $S$. 

We wish to apply these definitions to the singularities which arise in 
solutions of Einstein’s equations. The study of these singularities is made 
more difficult by the fact that it is not clear at the outset what questions one 
would like to have answered. Perhaps one of the central issues is: “What 
modifications⁶ of Einstein’s equations (these directed toward alleviating the 
problem of singularities) are suggested by the nature of the singularities which 
do occur?” Unfortunately, we have comparatively few exact solutions of 


⁵ The precise definitions are too cumbersome to be described here. They may be 
found in the more complete discussion in J. Math. Phys. (February 1968). 
⁶ Under the term “modifications,” let us also include quantization. 




Einstein’s equations to investigate, and all of these possess some symmetry. 
To study the g-boundary of the general incomplete space-time (without impos- 
ing Einstein’s equations) is not a very promising line of attack because of the 
possibility that only singularities with a particular structure arise in solutions 
of Einstein’s equations [3]. Perhaps the most that we can hope for at present 
is to understand the singularity structure of those solutions which are avail- 
able. In Table 1 we list the g-boundaries of a number of exact solutions. 


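TABLE 1 

                                                          The g-boundary
                                      ------------------------------------------------------
The solution                          Topology          Attached to         Coordinates on
                                                        space-time at       the g-boundary

Schwarzschild:                        $S^2 \times R$    $r = 0$             $t, \theta, \phi$
  $ds^2 = -(1 - 2m/r)\,dt^2 + (1 - 2m/r)^{-1}\,dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\phi^2)$
  $0 < r < 2m$

Reissner-Nordström:                   $S^2 \times R$    $r = 0$             $t, \theta, \phi$
  $ds^2 = -(1 - 2m/r + e^2/r^2)\,dt^2 + (1 - 2m/r + e^2/r^2)^{-1}\,dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\phi^2)$
  $m > e$, $0 < r < m - (m^2 - e^2)^{1/2}$

Friedmann:                            $S^3$             $t = 0$ and         $\chi, \theta, \phi$
  $ds^2 = -dt^2 + R^2(t)\{d\chi^2 + \sin^2\chi\,(d\theta^2 + \sin^2\theta\,d\phi^2)\}$    $t = 2\pi a$
  $R = a(1 - \cos\eta)$
  $t = a(\eta - \sin\eta)$
  $a$ = const.

Kasner:                               $R^3$             $t = 0$             $x, y, z$
  $ds^2 = -dt^2 + t^{2p_1}\,dx^2 + t^{2p_2}\,dy^2 + t^{2p_3}\,dz^2$
  $p_1 = -s/(1 + s + s^2)$
  $p_2 = s(1 + s)/(1 + s + s^2)$
  $p_3 = (1 + s)/(1 + s + s^2)$
  $s$ = const. $\in (0, 1)$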
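
As a quick consistency check on the Kasner row of Table 1, the short sketch below evaluates the three exponents for a given parameter $s$. It also verifies the two standard Kasner conditions, $p_1 + p_2 + p_3 = 1$ and $p_1^2 + p_2^2 + p_3^2 = 1$; these conditions are not stated in the text and are brought in here as a known property of the Kasner solution. The function name `kasner_exponents` is a hypothetical helper.

```python
from fractions import Fraction


def kasner_exponents(s):
    """Return (p1, p2, p3) from the one-parameter form used in Table 1.

    Exact rational arithmetic is used so the two Kasner conditions can be
    checked without rounding error.
    """
    s = Fraction(s)
    d = 1 + s + s * s
    p1 = -s / d
    p2 = s * (1 + s) / d
    p3 = (1 + s) / d
    return p1, p2, p3


def satisfies_kasner_conditions(p):
    """Check sum(p) == 1 and sum(p^2) == 1 (assumed standard conditions)."""
    return sum(p) == 1 and sum(q * q for q in p) == 1


exponents = kasner_exponents(Fraction(1, 2))
conditions_hold = satisfies_kasner_conditions(exponents)
```

For any $s$ in (0, 1) the exponent $p_1$ is negative while $p_2$ and $p_3$ are positive, so one direction expands toward $t = 0$ while the other two contract.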


There is one solution, Taub space [4], which deserves special consideration. 
This is a geodesically incomplete, exact, source-free solution of 
Einstein’s equations. The underlying manifold is $S^3 \times R$, with each $S^3$ 
spacelike. The spacelike sections may be chosen to be homogeneous, but not 
isotropic. (Each $S^3$ carries a preferred vector field which is a Hopf fibering.) 




It is known [5] that there are two different extensions⁷ of Taub space, these 
obtained by attaching a NUT space [6] through a null 3-sphere. 
The g-boundary of Taub space consists of the disjoint union 


$$\partial = A \cup B \cup C$$


where $A$ and $B$ are each an $S^3$ and $C$ is an $S^2$. Both $A$ and $B$ are 
Hopf-fibered by $C$ with Hopf maps $\phi_A : A \to C$ and $\phi_B : B \to C$. A basis for the open 
sets of $\partial$ is as follows: the ordinary open sets of $A$ and of $B$, and also sets of 
the form $E \cup \phi_A^{-1}(E) \cup \phi_B^{-1}(E)$, where $E$ is an ordinary open set of $C$. The 
g-boundary is not even a $T_1$ topological space. The two extensions of Taub 
space may now be described as follows: In the first extension, the surface $A$ 
becomes a null 3-sphere (the null geodesics in $A$ being the circles of the Hopf 
fibering) through which Taub space is extended into NUT space. In the 
second extension, the roles of $A$ and $B$ are reversed. Thus, the two extensions 
of Taub space are characterized in a simple way in terms of the g-boundary. 
(Note also that we have a proof that locally Taub space has no more than 
two extensions.⁸) 
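
The separation properties just described can be checked on a finite toy model. The sketch below is only an analogy, under loud assumptions: the spheres $A$, $B$, $C$ are replaced by two-point sets, the Hopf maps by the hypothetical pairing a_i → c_i and b_i → c_i, and "ordinary open sets" by arbitrary subsets. What survives is the shape of the basis: open sets of $A$, open sets of $B$, and sets of the form $E \cup \phi_A^{-1}(E) \cup \phi_B^{-1}(E)$.

```python
from itertools import combinations

# Toy stand-ins for A, B, C (assumption: finite discrete sets, not spheres).
POINTS = ["a1", "a2", "b1", "b2", "c1", "c2"]

# Basis modeled on the text: singletons of A and B are open, but every
# basic open set containing c_i also contains its "fibers" a_i and b_i.
BASIS = [
    frozenset({"a1"}), frozenset({"a2"}),
    frozenset({"b1"}), frozenset({"b2"}),
    frozenset({"c1", "a1", "b1"}),
    frozenset({"c2", "a2", "b2"}),
]


def topology_from_basis(basis):
    """All unions of basis elements (the empty union gives the empty set)."""
    opens = {frozenset()}
    for r in range(1, len(basis) + 1):
        for combo in combinations(basis, r):
            opens.add(frozenset().union(*combo))
    return opens


def is_t0(points, opens):
    """T0: for each pair of distinct points some open set contains exactly one."""
    return all(
        any((x in u) != (y in u) for u in opens)
        for x, y in combinations(points, 2)
    )


def is_t1(points, opens):
    """T1: for each ordered pair (x, y) some open set contains x but not y."""
    return all(
        any(x in u and y not in u for u in opens)
        for x in points for y in points if x != y
    )


OPENS = topology_from_basis(BASIS)
```

In this toy model `is_t0(POINTS, OPENS)` holds but `is_t1(POINTS, OPENS)` fails, because no open set contains c1 without also containing a1. This mirrors the statement that the g-boundary of Taub space is not $T_1$. The model says nothing about the smooth or metric structure.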

If we ignore the sphere $C$ and attach just $A$ and $B$ to Taub space we 
can then obtain a non-Hausdorff manifold with boundary. This suggests 
that we carry out both extensions of Taub space simultaneously. That is, 
we attach the 3-spheres $A$ and $B$ to Taub space by the g-boundary prescription, 
and then attach to each of $A$ and $B$ the appropriate (NUT space) 
extension. The result is a non-Hausdorff (but not otherwise pathological) 
space-time. Further, it appears that this extended space-time is physically 
reasonable. Each geodesic (other than the exceptional ones which strike the 
2-sphere $C$) which was incomplete in the original Taub space now passes 
through either sphere $A$ or sphere $B$, and then into one or the other NUT 
space. An observer on a geodesic which passes through sphere $A$, for 
example, never enters the NUT space beyond sphere $B$, and so need not be 
concerned about that part of the space-time. No observer (at least if he is on 
a geodesic) would ever know that his space-time is non-Hausdorff.⁹ 

Taub space is a nice example of the pathologies which can arise in 
incomplete space-times, and of the way these pathologies may be sorted 
out using the g-boundary. 


⁷ By an “extension” of a space-time $M$ we mean a space-time $M'$ containing $M$ as a 
proper subset. 

⁸ More precisely, if $T'$ is a space-time containing Taub space $T$ as a subset, then 
$\bar{T} \subset T \cup A \cup B$, where $\bar{T}$ is the closure of $T$ in $T'$. This appears to be the strongest 
uniqueness result one can reasonably expect. 

⁹ It appears likely that this non-Hausdorff behavior of the extensions of Taub space 
is not stable, that is, that a small change in the Cauchy data for Taub space results in a new 
space-time in which extensions through $A$ and $B$ are not possible (R. Penrose and 
S. W. Hawking, private communications). What is not so clear is whether the 
non-Hausdorff character of the g-boundary is stable. 




Though we have given a prescription that succeeds in principle in 
defining the g-boundary of any space-time, in practice the construction may be 
quite difficult. In particular, one needs a considerable amount of informa- 
tion about the geodesics. In many of the known solutions of Einstein’s 
equations, integrals for the geodesics can be obtained up to quadratures, 
and these suffice for the construction. But the known solutions are highly 
symmetric: one cannot expect such a simplification to be available in general. 
Since only the “‘asymptotic properties’’ of the geodesics are relevant, one 
might hope to find a method to carry out the g-boundary construction with- 
out any detailed properties of the geodesics. Unfortunately, this hope has 
not yet been realized. We have seen, however, that some properties of the 
metric must be used to define the g-boundary. Since the geodesics are 
among the simplest such properties, it is not clear where to look for a different, 
more easily applied, construction which is still applicable to the generic, 
unsymmetrical space-time. 


REFERENCES 


1. R. Penrose, Phys. Rev. Lett. 14, 57 (1965); S. W. Hawking, Phys. Rev. Lett. 
17, 444 (1966), Proc. Roy. Soc. 294A, 511 (1966), 295A, 490 (1966), Proc. Roy. Soc. 
1967 (to appear); R. Geroch, Phys. Rev. Lett. 17, 445 (1966). 

2. C. W. Misner, J. Math. Phys. 4, 924 (1963); R. Geroch, Ann. Phys. (submitted 
for publication). 

3. E. M. Lifshitz, I. M. Khalatnikov, Adv. Phys. 12, 185 (1963). 

4. A. H. Taub, Ann. Math. 53, 472 (1951). 

5. C. W. Misner, J. Math. Phys. 4, 924 (1963). 

6. E. T. Newman, L. Tamburino, T. Unti, J. Math. Phys. 4, 915 (1963). 


IX 


Superspace and the Nature of 
Quantum Geometrodynamics 


JOHN ARCHIBALD WHEELER 


Allowable History Selected out of Arena of Dynamics by Constructive Interference 
Superspace 
Electricity as Lines of Force Trapped in the Topology of Space 
The Energy of the Vacuum 
A Particle as a Geometrodynamical Exciton 
Problem 1. “Derivation” of “Einstein-Hamilton-Jacobi Equation” 
Problem 2. Structure of Superspace 
Problem 3. Initial Conditions 
References and Notes 


ALLOWABLE HISTORY SELECTED OUT OF ARENA OF 
DYNAMICS BY CONSTRUCTIVE INTERFERENCE 


Particle dynamics takes place in the arena of spacetime. Geometro- 
dynamics takes place in the arena of superspace. One needs only to mount 
to this point of view to have the whole content of Einstein’s theory spread out 
before his eyes, as in the outlook from a mountain peak, and as well quantum 
geometrodynamics as classical geometrodynamics. What is the nature of the 
landscape that we see from this height? What structures can we hope to 
build upon this landscape? And what kinds of mysteries are hidden in the 
mists beyond? 

In looking at the terrain of dynamics, happily everyone by now is long 
accustomed to the Hamilton-Jacobi theory. It belongs entirely to the world 




of classical physics. Yet it carries one in an instant into the world of the 
quantum of action. 

When better has one ever seen the transition from quantum to classical 
than in the theory of a particle in motion? Call the potential V = V(x). 
Call the energy of the particle E. Then there is not the slightest hope of dis- 
cussing the movement of the particle in space and time. Complementarity 
forbids! The wave function is spread out all over space. That one sees in no 
way more easily than through the semiclassical approximation for the prob- 
ability amplitude function, 


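$$\psi_E(x, t) = \begin{pmatrix} \text{slowly varying} \\ \text{amplitude function} \end{pmatrix} \exp\left[\frac{i}{\hbar} S_E(x, t)\right]. \tag{1}$$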


It is of no help in localizing the probability distribution that the 
Hamilton-Jacobi function $S$ has in many applications a value large in comparison with 
the quantum of angular momentum $\hbar = 1.05 \times 10^{-27}$ g-cm²/sec. It is of no 
help that this “dynamical phase” (to give $S$ another name) obeys the simple 
Hamilton-Jacobi law of propagation, 


$$-\frac{\partial S}{\partial t} = \frac{1}{2m}\left(\frac{\partial S}{\partial x}\right)^2 + V(x). \tag{2}$$


And finally, it is of no help that the solution of this equation for a particle of 
energy E is extraordinarily simple, 


$$S_E(x, t) = -Et + \int^x \{2m[E - V(x)]\}^{1/2}\,dx + \delta_E. \tag{3}$$


The probability is still spread all over everywhere! There is not the slightest 
trace of anything like a localized world line, $x = x(t)$! 
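
That the function (3) does satisfy the Hamilton-Jacobi law (2) can be checked numerically. The sketch below is an illustration under stated assumptions: a harmonic potential $V(x) = \tfrac{1}{2}kx^2$ with $m = k = 1$ (chosen here, not taken from the text), the phase constant $\delta_E$ set to zero, the lower limit of the integral set to $x = 0$, and derivatives approximated by central differences. All function names are hypothetical.

```python
import math

M_PARTICLE = 1.0   # particle mass m (assumed value)
K_SPRING = 1.0     # spring constant k of the assumed model potential


def potential(x):
    """Model potential V(x) = k x^2 / 2 (an assumption for illustration)."""
    return 0.5 * K_SPRING * x * x


def momentum(x, energy):
    """p_E(x) = {2m[E - V(x)]}^(1/2), valid in the classically allowed region."""
    return math.sqrt(2.0 * M_PARTICLE * (energy - potential(x)))


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def action(x, t, energy):
    """S_E(x, t) = -E t + integral_0^x p_E dx, with delta_E = 0."""
    return -energy * t + simpson(lambda y: momentum(y, energy), 0.0, x)


def hamilton_jacobi_residual(x, t, energy, h=1e-4):
    """-dS/dt - [(dS/dx)^2 / 2m + V(x)], which should vanish by (2)."""
    ds_dt = (action(x, t + h, energy) - action(x, t - h, energy)) / (2 * h)
    ds_dx = (action(x + h, t, energy) - action(x - h, t, energy)) / (2 * h)
    return -ds_dt - (ds_dx ** 2 / (2.0 * M_PARTICLE) + potential(x))


residual = hamilton_jacobi_residual(x=0.7, t=0.3, energy=2.0)
```

The residual is zero up to discretization error at any point inside the classically allowed region $|x| < 2$ for these parameter values. The check confirms only that (3) solves (2); it says nothing yet about localization, which is the point of the passage.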

How old the idea of building wave packets out of monofrequency 
waves—and how easy! The probability amplitude is now a superposition of 
terms, qualitatively of the form 


$$\psi(x, t) = \psi_E(x, t) + \psi_{E+\Delta E}(x, t) + \cdots. \tag{4}$$


Destructive interference takes place almost everywhere. The wave packet is 
concentrated in the region of constructive interference. There the phases of 
the various waves agree; thus 


$$S_E(x, t) = S_{E+\Delta E}(x, t). \tag{5}$$


FIGURE 1. “Motion” and “world line” of a particle appear in quantum 
mechanics as the consequence of interference between wave trains 
that extend over all space. Above, potential energy as a function 
of distance for a model problem. 

Below, smooth lines numbered −20, −18, ..., 28 are wave 
crests of probability amplitude function $\psi(x, t) \sim$ (slowly varying 
amplitude factor) $\exp[(i/\hbar)S(x, t)]$ for energy $E$. Dashed lines, same 
for energy $E + \Delta E$. Shaded area, region of constructive interference 
(“wave packet”). Black dots mark locus of classical world 
line ($S_{E+\Delta E} = S_E$). 




At last a world line! And how easy to find the Newtonian motion from this 
condition of constructive interference: 


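$$0 = S_{E+\Delta E} - S_E$$
$$0 = -t\,\Delta E + \int^x \Delta p_E(x)\,dx + (\delta_{E+\Delta E} - \delta_E)$$
$$t = \int^x \frac{dx}{v_E(x)} + t_0 \quad \text{(Newton)} \tag{6}$$


Here $v_E(x)$ denotes the velocity at the location $x$, 


$$\frac{\Delta[2m(E - V)]^{1/2}}{\Delta E} = \frac{\Delta p_E}{\Delta E} \approx \frac{\partial(\text{momentum})}{\partial E} = \frac{1}{v_E(x)} = \begin{pmatrix}\text{time to cover} \\ \text{a unit distance}\end{pmatrix} \tag{7}$$


and the quantity $t_0$ is an abbreviation for 


$$\frac{\delta_{E+\Delta E} - \delta_E}{\Delta E} \approx \frac{d\delta_E}{dE}. \tag{8}$$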


Marvelously, not one trace of the quantum of action appears in the final 
solution for the motion. Yet the quantum principle supplies the whole 
rationale and motivation for talking about “‘ constructive interference.’’ The 
quantum comes in only when one recognizes the finite spread of the wave 
packet (Fig. 1). Then the idea of a world line has to be renounced. A whole 
range of histories contribute to the propagation of the particle from start to 
finish. This is the way the real world of quantum physics operates! 
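
The passage from the condition of constructive interference (5) to the Newtonian motion (6) can be checked on a case with a closed-form answer. The sketch below is an illustration under assumptions not found in the text: a linear potential $V(x) = Fx$ with $m = F = 1$, the phase constants $\delta_E$ set to zero (so $t_0 = 0$), and a particle that leaves $x = 0$ at $t = 0$ moving toward positive $x$. The function names are hypothetical.

```python
import math

MASS = 1.0    # particle mass m (assumed value)
FORCE = 1.0   # slope F of the assumed potential V(x) = F x


def phase_integral(x, energy):
    """W_E(x) = integral_0^x {2m[E - F x']}^(1/2) dx' in closed form."""
    return (math.sqrt(2.0 * MASS) * (2.0 / (3.0 * FORCE))
            * (energy ** 1.5 - (energy - FORCE * x) ** 1.5))


def time_from_interference(x, energy, delta_e=1e-6):
    """Solve S_{E+dE}(x, t) = S_E(x, t) for t, with S_E = -E t + W_E(x).

    The condition reads 0 = -t dE + [W_{E+dE}(x) - W_E(x)], so t is a
    finite-difference estimate of dW/dE, as in equations (6) and (7).
    """
    return (phase_integral(x, energy + delta_e)
            - phase_integral(x, energy)) / delta_e


def time_from_newton(x, energy):
    """Newtonian arrival time at x for constant deceleration F/m.

    v0 = (2E/m)^(1/2) at x = 0, v(x) = (2(E - F x)/m)^(1/2), and
    t = m (v0 - v) / F on the outgoing leg of the motion.
    """
    v0 = math.sqrt(2.0 * energy / MASS)
    v = math.sqrt(2.0 * (energy - FORCE * x) / MASS)
    return MASS * (v0 - v) / FORCE


t_wave = time_from_interference(x=1.0, energy=2.0)
t_newton = time_from_newton(x=1.0, energy=2.0)
```

The two times agree to within the finite-difference error, and Planck's constant appears nowhere in the calculation, in line with the remark that no trace of the quantum of action survives in the final solution for the motion.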


Three Dimensions, Not Four 


Similarly in geometrodynamics: Here the dynamic object is not spacetime. 
It is space. The geometrical configuration of space changes with time. 
But it is space, three-dimensional space, that does the changing. No surprise! 
In particle dynamics the dynamical object is not x and t, but only x. 
How to tell this to our friends in the world of mathematics? For so long they 
have heard us say that it was in default of the fourth dimension that Riemann 
could not have discovered general relativity. First there had to come special 
relativity and spacetime and the fourth dimension. Otherwise how could 
one have had any possibility to connect gravitation with the curvature of 
spacetime (Fig. 2)? This understood, how can physicists change their minds 
and “take back” one dimension? The answer is simple. A decade and 
more of work by Dirac, Bergmann, Schild, Pirani, Anderson, Higgs, Arnowitt, 
Deser, Misner, DeWitt, and others has taught us through many a hard 
knock that Einstein’s geometrodynamics deals with the dynamics of geometry: 
of 3-geometry, not 4-geometry [1, 2]. 




FIGURE 2. The track of the ball and the track of the photon through 
space (x, z plane) have very different curvatures, but in spacetime 
(x, z, ct space) the curvatures are comparable. 


What is a 3-geometry? To simplify the question, rephrase it in an 
everyday context: What is a 2-geometry? Nothing illustrates a 2-geometry 
more clearly than an automobile fender. In whatever way coordinates are 
painted on its surface, in whatever way the points of that surface are named or 
renamed, the fender keeps the same 2-geometry. Similarly for a 3-geometry. 
In mathematical terms, a $^{(3)}\mathcal{G}$ is not a positive definite 3 × 3 metric; instead, 
it is an equivalence class of such metrics that are transformable, one into 
another, by diffeomorphisms. 
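
The fender remark can be illustrated with the simplest 2-geometry, the flat plane. The sketch below is an illustration only: it describes one and the same curve in Cartesian coordinates, where the metric components are (1, 0, 1), and in polar coordinates, where they are (1, 0, r²), and computes the length of the curve from each description. The particular curve and the function names are assumptions made for the example.

```python
import math


def curve_length(metric, path, n=20000):
    """Approximate the length of `path` (a map u in [0, 1] -> coordinates)
    by summing sqrt(g_ij dq^i dq^j) over n small coordinate steps, with the
    metric components (g11, g12, g22) evaluated at each step's midpoint."""
    total = 0.0
    prev = path(0.0)
    for i in range(1, n + 1):
        cur = path(i / n)
        d1, d2 = cur[0] - prev[0], cur[1] - prev[1]
        mid = ((cur[0] + prev[0]) / 2.0, (cur[1] + prev[1]) / 2.0)
        g11, g12, g22 = metric(mid)
        total += math.sqrt(g11 * d1 * d1 + 2.0 * g12 * d1 * d2 + g22 * d2 * d2)
        prev = cur
    return total


def cartesian_metric(q):
    """Flat plane in coordinates (x, y): ds^2 = dx^2 + dy^2."""
    return (1.0, 0.0, 1.0)


def polar_metric(q):
    """The same plane in coordinates (r, theta): ds^2 = dr^2 + r^2 dtheta^2."""
    r, _theta = q
    return (1.0, 0.0, r * r)


def path_cartesian(u):
    """A sample curve, chosen arbitrarily, that stays away from the origin."""
    return (1.0 + u, u * u)


def path_polar(u):
    """The same curve, with its points renamed by polar coordinates."""
    x, y = path_cartesian(u)
    return (math.hypot(x, y), math.atan2(y, x))


length_cartesian = curve_length(cartesian_metric, path_cartesian)
length_polar = curve_length(polar_metric, path_polar)
```

The metric components differ between the two descriptions, yet the two lengths agree to within discretization error. Quantities of this kind, unchanged under renaming of the points, are what belong to the geometry rather than to any one metric in the equivalence class.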

Mere baggage is the right term not only for the points of the space, but 
also for the coordinates employed to label these points, and for the metric that 
tells the distance from each point to all its near neighbors. Behind all of that 
paraphernalia lies the real idea, the concept of 3-geometry, as solid and substantial 
as the 2-geometry of the fender. Down with “points”; up with 
“geometry”! 


SUPERSPACE 


One climbs up to the concept of geometry only to find a new height 
beyond—superspace [3]. Superspace is the manifold, a single point of which 
stands for an entire 3-geometry (Fig. 3). A 3-geometry stands at the halfway 
mark between point and superspace: risen though it is above the concept of 
point by abstractification, any one geometry of space counts as only a single 
point of superspace. 




FIGURE 3. Superspace $\mathcal{S}$ is the manifold each of whose “points” A, B, C, ... 
is an abbreviation for one 3-geometry. A submanifold H of $\mathcal{S}$ is the 
“classical history of the geometry of space” when space has been 
started off under some particular set of dynamical initial conditions. 
In other words, H consists of all those spacelike 3-geometries that can 
be obtained as spacelike sections through one particular 4-geometry 
(that satisfies Einstein’s classical field equations). The 3-geometries 
of the classical history H may alternatively but equivalently be 
distinguished from other 3-geometries by the fact that they satisfy 
the classical “condition of constructive interference” 
$S(^{(3)}\mathcal{G}) = S'(^{(3)}\mathcal{G}) = S''(^{(3)}\mathcal{G}) = \cdots$ (“coincidence of wave crests” in lower 
magnified view of a region of superspace). In quantum theory the 
wave packet is not localized with unlimited sharpness. The probability 
amplitude $\psi$ has significant values for 3-geometries at some 
small “distance” in superspace on each “side” of the classical 
history H; hence the “quantum fluctuations in the geometry of 
space” (depicted symbolically in more detail in Fig. 5). 




Superspace is the arena for geometrodynamics, just as Lorentz- 
Minkowski space-time is the arena for particle dynamics (Table 1). The 
momentary configuration of the particle is an event, a single point in space- 
time. The momentary configuration of space is a 3-geometry, a single point 
in superspace. 


TABLE 1. Geometrodynamics Compared with Particle Dynamics 


Quality                      Particle                       Geometrodynamics

Dynamical entity             Particle                       Space

Descriptors of momentary     x, t (“event”)                 $^{(3)}\mathcal{G}$ (“3-geometry”)
configuration

History                      x = x(t)                       $^{(4)}\mathcal{G}$ (“4-geometry”)

History is a stockpile of    Yes. Every point on world      Yes. Every spacelike
configurations?              line gives a momentary         slice through $^{(4)}\mathcal{G}$ gives
                             configuration of particle      a momentary configuration
                                                            of space

Dynamic arena                Spacetime (totality of all     Superspace (totality of all
                             points x, t)                   $^{(3)}\mathcal{G}$'s)


Is superspace a proper manifold? Its construction is simple enough: 
call a 3-geometry a “point,” and put all such points together. Or why limit 
attention to 3-geometries with positive definite metric? Build “extended 
superspace”; call every 3-geometry a “point” even if its signature is not 
+ + +, and put all these “points” together. Does the resulting mathematical 
object possess a reasonable topology? Is it a true manifold? No! In 
a proper manifold each point has a neighborhood homeomorphic to an open 
set in a Banach space, and two distinct points have disjoint neighborhoods. 
Not so here. In an important investigation Michael Stern has shown [4] that 
for each point of extended superspace there exists another point, distinct from 
it, which nevertheless cannot be isolated from it by open sets. The topology 
is not Hausdorff. Extended superspace is not a manifold. It can hardly be 
regarded as an acceptable arena for dynamics. 

Superspace, in contrast, Stern has shown, does have Hausdorff topology 
and does constitute a manifold. According to these mathematical considera- 
tions, superspace is the proper arena for geometrodynamics. 

Physical considerations point to the same conclusion. To insist that a 
3-geometry shall have positive definite metric is to guarantee that no light ray 
can traverse this 3-geometry. No physical effect can propagate from one 
point of the $^{(3)}\mathcal{G}$ to another. A physical quantity local to the one point and a 
physical quantity local to the other have zero reciprocal coupling. They 
commute. Such quantities lend themselves to simultaneous specification over 




the entire 3-geometry. No simpler example exists of a “complete observation” 
in quantum geometrodynamics. Closely related is the initial value 
problem of classical geometrodynamics. 

“Observation?” Does not observation imply an observer? If there 
is an observer, does he not respond to every event on his past light cone? 
And consequently does not that past light cone with its + + 0 metric supply 
the appropriate geometry on which to specify physical conditions? No, and 
for two reasons. First, to know the state of the geometry on a past light 
cone, however completely, is still to be powerless to predict from Einstein’s 
field equations the future of the geometry. Predict the 4-geometry within the 
past light cone? Yes [5]. Outside—to the future? No. Into the domain 
of the future, influences flow from afar without ever once impinging upon the 
light cone. “Demidynamics” (prediction into the past) one can do; full 
dynamics, no. The fault is in the choice of initial value hypersurface. The 
cone creates an unsymmetrical divide between past and future. How different 
from a spacelike initial value hypersurface! The initial value data on it allow 
a prediction of the complete geometrodynamical history, past and future. 

Second, the original argument was mistaken that one should consider 
“a light cone converging upon an observer.” The “observer” of dynamical 
theory is not and cannot be a single detector of events either in special 
relativity or in general relativity. Instead he “collects the printout” [6] from a 
multitude of detectors dotted densely about. Each of his detectors is 
sensitive only for an instant. These instants of sensitivity bear a spacelike 
relationship each to the other [7]. Collectively they define a spacelike 
hypersurface, a “simultaneity,” a common moment of a rudimentary “time” 
variable that perhaps is not and certainly need not be specified any further. 
In its moment of sensitivity each detector responds to the appropriate influ- 
ence: to particle proximity in particle physics, to local field strength in electro- 
dynamics, and to local geometry in geometrodynamics. Whichever the 
dynamic entity, the measurements of it do not and cannot fully serve dynamic 
theory unless they span a spacelike hypersurface. 

Distill this discussion! Where specify dynamically complete initial 
value data? On a spacelike 3-geometry, yes; on a null 3-geometry, no. 
What “points” belong to a topologically acceptable arena for geometrodynamics? 
Spacelike 3-geometries, yes; null 3-geometries, no. What a remarkable 
correspondence between dynamics and topology! 




Wave Packet in Superspace and Its Propagation 


So much for the dynamical entity, space; and so much for the arena in 
which the dynamics takes place, superspace; now for the dynamics itself. 
Not one trace of dynamics does one see when he examines the typical probability 
amplitude function, $\psi = \psi(^{(3)}\mathcal{G})$, for it is spread all over superspace. No 




surprise! Already in classical theory the Hamilton-Jacobi function, 
$S = S(^{(3)}\mathcal{G})$, is spread out over the manifold. Moreover, this “dynamical phase 
function” of classical geometrodynamics gives at once in the semiclassical 
approximation the actual phase of $\psi$, according to the formula 


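$$\psi(^{(3)}\mathcal{G}) = \begin{pmatrix}\text{slowly varying} \\ \text{amplitude function}\end{pmatrix} \exp\left[\frac{i}{\hbar} S(^{(3)}\mathcal{G})\right]. \tag{9}$$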


This is indication enough that $\psi$ and $S$ are both unlocalized! Dynamics first 
clearly becomes recognizable when sufficiently many such spread-out probability 
amplitude functions are superposed to build up a localized wave 
packet [8, 9]: 


$$\psi = c_1\psi_1 + c_2\psi_2 + \cdots. \tag{10}$$


Constructive interference occurs where the phases of the several 
individual waves agree [10]: 


$$S_1(^{(3)}\mathcal{G}) = S_2(^{(3)}\mathcal{G}) = \cdots. \tag{11}$$


The $^{(3)}\mathcal{G}$'s compatible with these conditions of constructive interference 
constitute the classical geometrodynamical history of space (Fig. 3): Marvelously, 
every 3-geometry that satisfies the conditions of constructive interference 
(11) can be obtained as a spacelike slice through a certain 4-geometry. 
More marvelously, this $^{(4)}\mathcal{G}$ satisfies the ten field equations of Einstein. In 
other words, in addition to the principle of constructive interference one needs 
only the single equation of Hamilton and Jacobi for the “dynamical phase” 
$S(^{(3)}\mathcal{G})$ to obtain all of classical geometrodynamics. The proof of this important 
point has been announced by Gerlach [11]. 

The Hamilton-Jacobi equation itself was first given explicitly in the 
literature by Peres [12] on the foundation of earlier work by himself and 
others on the Hamiltonian formulation of geometrodynamics [13]: 


$$g^{-1/2}\left(g_{ik}g_{jl} - \tfrac{1}{2}g_{ij}g_{kl}\right)\frac{\delta S}{\delta g_{ij}}\frac{\delta S}{\delta g_{kl}} + g^{1/2}\,{}^{(3)}R = 0 \tag{12}$$


Here the $g_{ij}$ are the coefficients in the metric on $^{(3)}\mathcal{G}$
and $g$ is the determinant of these metric coefficients. The quantity
$^{(3)}R$ is the local value of the scalar curvature invariant of the
geometry intrinsic to $^{(3)}\mathcal{G}$. The Hamilton-Jacobi function
depends only upon the 3-geometry, and not upon how that 3-geometry is
expressed in terms of metric coefficients in a particular coordinate
patch. However, $S$ is treated in (12) as if it depended upon the metric
coefficients individually, $S = S(g_{11}, g_{12}, \ldots, g_{33})$. With
this understanding, $\delta S/\delta g_{ij}$ denotes the functional
derivative of $S$ with respect to alterations in the function
$g_{ij}(x, y, z)$. The fact that coordinates in the end have nothing to do
with the matter can be stressed by rewriting (12) symbolically in the form


(13) 


This “Einstein-Hamilton-Jacobi equation” contains all of classical
geometrodynamics in regions where no “real” sources of mass-energy are
present. This one equation carries the entire content of Einstein’s ten
field equations. Consequently, this equation has been checked and is
subject to further check, in the same sense and to the same degree that the
predictions of Einstein’s field equations have been verified, and are
subject to further verification. In the simplest version of particle
physics a world line has a simple meaning: It is a stockpile consisting of
all those points $(x, t)$ that satisfy the condition of constructive
interference. There are $\infty^1$ of these points on the world line, if
we use an obvious though loose way of counting. Dynamics marks out these
points and gives preference to them over all the $\infty^2$ points in the
arena of particle dynamics. In geometrodynamics a 4-geometry has a similar
significance. It is a stockpile consisting of all those 3-geometries that
satisfy the condition of constructive interference. There are
$\infty^{\infty^3}$ of these 3-geometries, obtainable by making a spacelike
slice through the 4-geometry in one or another way:

$$t = t(x, y, z) \tag{14}$$

thus,

(a) $\infty^3$ points $(x, y, z)$
(b) $\infty^1$ choices for $t$ at each of these points

hence

(c) $\infty^{\infty^3}$ choices of 3-geometry altogether (15)


Classical geometrodynamics marks out these $^{(3)}\mathcal{G}$’s and gives
preference to them over the infinitely more numerous totality of
$^{(3)}\mathcal{G}$’s to be found in the entire arena of geometrodynamics.
That arena, superspace, contains $^{(3)}\mathcal{G}$’s to the number
$(\infty^3)^{\infty^3}$, calculated most simply as follows:

(a) $\infty^3$ points; and, in a coordinate system that makes the metric
diagonal,
(b) 3 diagonal components of the metric specifiable per space point; and
hence
(c) $\infty^3$ choices of the metric per space point; and therefore
(d) $(\infty^3)^{\infty^3}$ choices of $^{(3)}\mathcal{G}$ altogether (16)




From this totality of conceivable 3-geometries one has to pick out and exhibit 
all the dynamically allowed 3-geometries before he has told in all fullness how 
space evolves with time. No new lesson! One has long known that time in 
general relativity is a many-fingered entity. The hypersurface drawn through 
spacetime to give one $^{(3)}\mathcal{G}$ can be pushed forward in time a
little here or a little there or a little somewhere else to give one or
another or another new $^{(3)}\mathcal{G}$. “Time” conceived in these terms
means nothing more or less than the location of the $^{(3)}\mathcal{G}$ in
the $^{(4)}\mathcal{G}$. In this sense “3-geometry is a carrier of
information about time” [14].


“Spacetime,” a Concept of Limited Validity


The child’s toy can be removed from its box only to reveal another box 
and—that taken away—another box, and so on, until eventually there are 
dozens of boxes scattered over the floor. Or conversely the boxes can be put 
back together, nested one inside the other, to reconstitute the original package. 
The packaging of $^{(3)}\mathcal{G}$’s into a $^{(4)}\mathcal{G}$ is much
more sophisticated. Nature provides no monotonic ordering of the
$^{(3)}\mathcal{G}$’s. Two of the dynamically allowed
$^{(3)}\mathcal{G}$’s taken at random will often cross each other one or
more times. When one shakes the $^{(4)}\mathcal{G}$ apart, he therefore
gets enormously more $^{(3)}\mathcal{G}$’s “spread out over the floor” than
he might otherwise have imagined. Conversely, when one puts back together
all of the $^{(3)}\mathcal{G}$’s allowed by the condition of constructive
interference, he gets a structure with a rigidity that he might not
otherwise have foreseen. This rigidity arises from the infinitely rich
interleaving and intercrossing of clear-cut well-defined
$^{(3)}\mathcal{G}$’s one with another. In summary, (1) the
$^{(3)}\mathcal{G}$’s allowed by (11) are the basic building blocks; (2)
their interconnections give $^{(4)}\mathcal{G}$ its existence, its
dimensionality [15], and its “magic structure”; and (3) in this structure
every $^{(3)}\mathcal{G}$ has a rigidly fixed location of its own.

How different from the textbook concept of spacetime! There the 
geometry of spacetime is conceived as constructed out of elementary objects, 
or points, known as “events.” Here, by contrast, the primary concept is
3-geometry, and the event is secondary: (1) The event lies at the
“intersection” of such and such $^{(3)}\mathcal{G}$’s. (2) Its timelike
relation to some other $^{(3)}\mathcal{G}$ is determined by the structure of
the $^{(4)}\mathcal{G}$, which in turn derives from the intercrossings of
all the other $^{(3)}\mathcal{G}$’s.

Whether one starts with $^{(3)}\mathcal{G}$’s as primary and regards the “event” as a
derived concept, or vice versa, might make little difference if one were to 
remain in the domain of classical geometrodynamics. It makes all the differ- 
ence when one turns to quantum geometrodynamics. 

There is no such thing as a 4-geometry in quantum geometrodynamics, 
and for a simple reason. No probability amplitude function
$\psi(^{(3)}\mathcal{G})$ can propagate through superspace as an
indefinitely sharp wave packet. It spreads (Fig. 3). It has a finite
probability amplitude in a domain of superspace of finite measure. This
domain encompasses a set of $^{(3)}\mathcal{G}$’s far too numerous to
accommodate in any one $^{(4)}\mathcal{G}$. One can express this situation
in various terms. One can say that propagation takes place in superspace,
not by following any one classical history of space, not by following any
one $^{(4)}\mathcal{G}$, but by summation of contributions from an infinite
variety of such histories. This extension of Feynman’s concept of “sum over
histories” has received special attention from Misner [16]. In whatever way
one states the matter, however, the facts are clear. The
$^{(3)}\mathcal{G}$’s that occur with significant probability amplitude do
not fit and cannot be fitted into any single $^{(4)}\mathcal{G}$. That
“magic structure” of classical geometrodynamics simply does not exist.
Without that building plan to organize the $^{(3)}\mathcal{G}$’s of
significance into a definite relationship, one to another, even the “time
ordering of events” is a notion devoid of all meaning.

These considerations reveal that the concepts of spacetime and time 
itself are not primary but secondary ideas in the structure of physical theory. 
These concepts are valid in the classical approximation. However, they have 
neither meaning nor application under circumstances when quantum- 
geometrodynamical effects become important. Then one has to forgo that 
view of nature in which every event, past, present, or future, occupies its 
preordained position in a grand catalog called “spacetime.” There is no
spacetime, there is no time, there is no before, there is no after. The question
what happens “next” is without meaning.


The Planck Length and Gravitational Collapse 


Under everyday circumstances these unexpected consequences of the 
quantum principle never come into evidence. The characteristic dimension 
of quantum geometrodynamics is the Planck length [17],
$(\hbar G/c^3)^{1/2} = 1.6 \times 10^{-33}$ cm. By comparison the normally
relevant scale of any geometry of interest is stupendous. Negligible on the
scale of the geometry is the quantum-mechanical spread of the wave packet in
superspace. Consequently the dynamical evolution of the geometry can be
treated in the context of classical geometrodynamics. Thus the geometries
that occur with significant probability amplitude can be idealized to a good
approximation as if confined to a region in superspace of zero thickness.
The $^{(3)}\mathcal{G}$’s of this limited set are sufficiently small in
number to fit together into a single $^{(4)}\mathcal{G}$. They are
sufficiently large in number to reproduce every conceivable spacelike slice
of that $^{(4)}\mathcal{G}$. In this approximation it makes good sense to
speak of “the classical geometrodynamical history of space.”

If the dynamics of geometry thus normally lends itself to classical 
analysis, there are two contexts where it does not. One is the final stage
of gravitational collapse. The other is analysis of the microscopic
quantum-mechanical fluctuations in the geometry of space and their
consequences for physics generally.

Of all the applications of quantum geometrodynamics, none would seem 
more immediate than gravitational collapse [18]. Here according to classical 
general relativity the dimensions of the collapsing system in a finite proper 
time are driven down to indefinitely small values. The phenomenon is not
limited to the space occupied by matter. It occurs also in the space surround- 
ing the matter. In a finite proper time the calculated curvature rises to 
infinity. At this point classical theory becomes incapable of further predic- 
tion. In actuality, classical considerations go wrong before this point. A 
prediction that is infinity is not a prediction. The wave packet in superspace 
does not and cannot follow the classical history when the geometry becomes 
smaller in scale than the quantum-mechanical spread of the wave packet. 
Not a new phenomenon! Throughout physics one sees examples of a wave 
impinging upon a region of interaction of dimensions small compared to a 
wavelength. The outcome is scattering or diffraction. One speaks of a 
probability for this, that, or the other outcome of the interaction. The 
photon or phonon or other entity entering from one direction emerges in 
another. The concept of a deterministic world line may serve adequately 
during the phase of approach to the zone of interaction, and during the phase 
of regression. It is completely out of place during the phase of scattering. 
So here. The concept of a deterministic history of geometry, a well-defined
$^{(4)}\mathcal{G}$, makes sense in the early phase of gravitational
collapse, but has absolutely no application in the decisive phase. There
“space-time” is nonexistent, “events” and the “time ordering of events” are
without meaning, and the question “what happens after the final phase of
gravitational collapse” is a mistaken way of speaking.

The correct way of speaking deals with the propagation of the probability
amplitude $\psi(^{(3)}\mathcal{G})$ in superspace. The semiclassical treatment of the
propagation (Eqs. (9), (13)) is appropriate in most of the domain of super- 
space of interest for gravitational collapse. Not so in the decisive region. 
There, as in elementary problems of scattering, the mathematical analysis has 
to go straight back to the full and accurate wave equation for its foundation. 
What is the appropriate question to ask of the mathematics? One has 
learned how to formulate the right question, in the case of scattering, through 
long experience with the physics. However, if one had not had that physical 
experience, he would have learned out of the mathematical formalism itself 
to speak about incoming plane waves and outgoing spherical waves, and 
scattering amplitudes, and about all the other relevant concepts. Similarly 
in quantum geometrodynamics, where one has so much less experience. The 
mathematical formalism itself must serve as the final arbiter on how to pose 
the central question about gravitational collapse as well as how to answer it! 




Quantum Fluctuations in Geometry of Space


“Quantum fluctuations in the geometry of space” is the other pressing
field of application of quantum geometrodynamics. What a strange
combination of words! Fluctuations are well known. The term “quantum
fluctuations” carries a deeper meaning. It stands for a movement that can
never be frozen out, however low the temperature. Such fluctuations are
universal. In the hydrogen molecule both the separation of the two atoms
and their relative momentum continually fluctuate. Fixity of both would
violate the uncertainty principle. In the frozen vacuum of quantum
electrodynamics the electric and magnetic fields both fluctuate. Were both
of these dynamically conjugate field variables to vanish, the uncertainty
principle would likewise fail. The same is true of quantum geometrodynamics.
There the conjugate variables are the “intrinsic curvature” of
three-dimensional space and the “extrinsic curvature,” telling how this
space is bent relative to the geometry of any 4-dimensional space-time that
might envelop it. For
both dynamic quantities to be stilled would equally contradict Heisenberg’s 
uncertainty relation. Thus all space at the quantum scale of distances is the 
seat of the liveliest geometrodynamics, as it is also everywhere the scene of the 
most violent small-scale fluctuations in the electromagnetic field. 

No prediction of quantum electrodynamics has been more impressively 
verified in the whole post-World War II era than these vacuum fluctuations in 
the electric field. Their perturbing influence on the motion of the electron 
(Fig. 4) accounts for the major component in the Lamb shift in the energy 
levels of the hydrogen atom [19]. 

In putting numbers to the fluctuations in geometry one is guided by the 
example of electromagnetism and, at a still earlier stage, by the example of the 
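harmonic oscillator. For the oscillator the expression for the energy is

$$E = \langle\text{kinetic energy}\rangle + \langle\text{potential energy}\rangle = \int \psi^*(x)\left[-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + \tfrac{1}{2}m\omega^2x^2\right]\psi(x)\,dx \tag{17}$$

For a probability amplitude function $\psi(x)$ of Gaussian form and range $a$,

$$\psi(x) = \pi^{-1/4}a^{-1/2}\exp\left(-\frac{x^2}{2a^2}\right) \tag{18}$$

the expectation value of the energy is

$$E = \frac{\hbar^2}{4ma^2} + \tfrac{1}{4}m\omega^2a^2 \tag{19}$$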




FIGURE 4. Symbolic representation of motion of electron in hydrogen atom
as affected by fluctuations in electric field in vacuum (“vacuum” or
“ground state” or “zero-point” fluctuations). The electric field
associated with the fluctuation,
$E_x(t) = \int E_x(\omega)e^{-i\omega t}\,d\omega$, brings about in the most
elementary approximation the displacement
$\Delta x = \int (e/m\omega^2)E_x(\omega)e^{-i\omega t}\,d\omega$. The
average vanishes but the mean square $\langle(\Delta x)^2\rangle$ does not.
In consequence the electron feels an effective atomic potential altered from
the expected value $V(x, y, z)$ by the amount

$$\Delta V(x, y, z) = \tfrac{1}{2}\langle(\Delta x)^2\rangle\,\nabla^2 V(x, y, z)$$

The average of this perturbation over the unperturbed motion accounts for
the major part of the observed Lamb-Retherford shift
$\Delta E = \langle\Delta V(x, y, z)\rangle$ in the energy level. Conversely,
the observation of the expected shift makes the reality of the vacuum
fluctuations inescapably evident.


If quantum effects were absent, the first term would disappear. The mini- 
mum energy would be obtained by putting the oscillator at rest at the origin. 
However, in the real world of quantum physics such a sharp localization in 
position ($a = 0$) would make the effective wavelength zero and the momentum
and kinetic energy arbitrarily large (divergent first term in Eq. (19)). The
minimum energy, $E = \tfrac{1}{2}\hbar\omega$ (“half quantum” or “zero-point
energy” or “fluctuation energy”), is obtained for a range
$a = (\hbar/m\omega)^{1/2}$ (“range of zero-point oscillations”). In other
words, the oscillator in its ground state

$$\psi_0(x) = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4}\exp\left[-\left(\frac{m\omega}{2\hbar}\right)x^2\right] \tag{20}$$

can be said to “resonate” between locations in space ranging over a region of
extent $\sim(\hbar/m\omega)^{1/2}$.
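
The balance between the two terms of the energy can be checked numerically. The following is a minimal sketch, assuming the Gaussian trial energy E(a) = hbar^2/(4 m a^2) + m omega^2 a^2/4 and units in which hbar = m = omega = 1 by default; the function names `trial_energy` and `minimize_range` are illustrative only. A ternary search is adequate because E(a) is convex for a > 0.

```python
def trial_energy(a, hbar=1.0, m=1.0, omega=1.0):
    """Energy expectation for a Gaussian of range a: kinetic + potential."""
    return hbar ** 2 / (4.0 * m * a ** 2) + 0.25 * m * omega ** 2 * a ** 2


def minimize_range(lo=0.05, hi=20.0, iters=200, **params):
    """Ternary search for the range a that minimizes the convex trial energy."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if trial_energy(m1, **params) < trial_energy(m2, **params):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


a_min = minimize_range()
# Expected on analytic grounds: a = (hbar / (m omega))**0.5, E = hbar omega / 2.
print(a_min, trial_energy(a_min))
```

Squeezing a below the minimizing value raises the kinetic term faster than it lowers the potential term, which is the numerical content of the statement that the oscillator cannot be put at rest at the origin.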

The electromagnetic field can be treated as an infinite collection of 
independent “field oscillators,” with amplitudes $\xi_1, \xi_2, \ldots$. When
the Maxwell field is in its state of lowest energy, the probability amplitude
for the first oscillator to have amplitude $\xi_1$, and simultaneously the
second oscillator to have amplitude $\xi_2$, the third $\xi_3$, and so on, is
the product of functions of the form (20), one for each oscillator. When the
scale of amplitudes for each oscillator is suitably normalized, the resulting
infinite product takes the form

$$\psi(\xi_1, \xi_2, \ldots) = N\exp\left[-(\xi_1^2 + \xi_2^2 + \cdots)\right] \tag{21}$$


This expression gives the probability amplitude $\psi$ for a configuration
$\mathbf{B}(x, y, z)$ of the magnetic field that is described by the Fourier
coefficients $\xi_1, \xi_2, \ldots$. One can forgo any mention of these
Fourier coefficients if he so desires, however, and rewrite (21) directly in
terms of the magnetic field configuration itself [20]:

$$\psi(\mathbf{B}(x, y, z)) = N\exp\left(-\int\!\!\int \frac{\mathbf{B}(\mathbf{x}_1)\cdot\mathbf{B}(\mathbf{x}_2)}{16\pi^3\hbar c\,r_{12}^2}\,d^3x_1\,d^3x_2\right) \tag{22}$$

No longer does one speak of “the” magnetic field; he talks instead of the
probability of this, that, or the other configuration of the magnetic field and
this even under circumstances, as here, where the electromagnetic field is in its
ground state.

It is reasonable enough under these circumstances that the configuration
of greatest probability is $\mathbf{B}(x, y, z) = 0$. Consider for comparison
a configuration where the magnetic field is again everywhere zero except in a
region of dimension $L$. There let the field, subject as always to the
condition $\operatorname{div}\mathbf{B} = 0$, be of the order of magnitude
$\Delta B$. The probability amplitude for this configuration will be reduced
relative to the nil configuration by a factor $\exp(-I)$. Here the quantity
$I$ in the exponent is of the order $(\Delta B)^2L^4/\hbar c$. Configurations
for which $I$ is large compared to 1 occur with negligible probability.
Configurations for which $I$ is small compared to 1 occur with practically the
same probability as the nil configuration. In this sense, one can say that the
fluctuations in the magnetic field in a region of extension $L$ are of the
order of magnitude

$$\Delta B \sim \frac{(\hbar c)^{1/2}}{L^2} \tag{23}$$




In other words, the field “resonates” between one configuration and another
with the range of configurations of significance given by (23). Moreover, the
smaller is the region of space under consideration, the larger are the field 
magnitudes that occur with appreciable probability. 

Still another familiar way of speaking about electromagnetic field 
fluctuations gives additional insight relevant to geometrodynamics. One 
considers a measuring device responsive in comparable measure to the mag- 
netic field at all points in a region of dimension L. One asks for the effect on 
this device of electromagnetic disturbances of various wavelengths. A dis- 
turbance of wavelength short compared to L will cause forces to act one way 
in some parts of the detector and will give rise to nearly compensating forces 
in other parts of it. In contrast, a disturbance of a long wavelength
$\lambda$ produces forces everywhere in the same direction, but of a
magnitude too low to have much effect. Thus the field, estimated from the
equation

$$\begin{pmatrix}\text{energy of electromagnetic}\\ \text{wave of wavelength } \lambda \text{ in a}\\ \text{domain of volume } \lambda^3\end{pmatrix} \sim \begin{pmatrix}\text{energy of one quantum}\\ \text{of wavelength } \lambda\end{pmatrix}$$

or

$$B^2\lambda^3 \sim \frac{\hbar c}{\lambda}$$

or

$$B \sim \frac{(\hbar c)^{1/2}}{\lambda^2} \tag{24}$$

is very small if $\lambda$ is large compared to the domain size $L$. The
biggest effect is caused by a disturbance of wavelength $\lambda$ comparable
to $L$ itself. This line of reasoning leads directly from (24) to the
standard fluctuation formula (23).


Fluctuations Superposed on Classical Background 


Nothing says that the electromagnetic field has to be in its ground state. 
It can be excited by a distant wireless antenna so that locally it is
oscillating up and down at $10^6$ cycles/sec in the range
$B = \pm 3.3 \times 10^{-8}$ gauss, for example (accompanying electric
fields 1 millivolt/meter). By comparison with this deterministic classical
history of the local magnetic field,

$$B_z = 3.3 \times 10^{-8}\ \text{gauss} \times \cos(2\pi \times 10^6\,t) \tag{25}$$

the quantum-mechanical fluctuation field of the same frequency (Eq. (24)) is
quite negligible:

$$B \sim \frac{(\hbar c)^{1/2}}{\lambda^2} \sim \frac{6 \times 10^{-9}\,(\text{erg cm})^{1/2}}{(3 \times 10^4\ \text{cm})^2} \sim 10^{-17}\ \text{gauss} \tag{26}$$




Even the total effect of all the independent quantum fluctuations in the
magnetic field is small as sensed by a detector of dimension $L \sim 1$ cm;
thus, from Eq. (23),

$$\Delta B \sim \frac{6 \times 10^{-9}\,(\text{erg cm})^{1/2}}{(1\ \text{cm})^2} \sim 6 \times 10^{-9}\ \text{gauss} \tag{27}$$

However, when attention is fixed on the field in a still smaller domain, say
$L \sim 0.1$ cm or less, then the quantum fluctuations in the magnetic field
dominate over the deterministic classical field of (25):

$$\Delta B \gtrsim 6 \times 10^{-7}\ \text{gauss} \tag{28}$$

So much for the coexistence of classical fields and quantum fluctuations in
electrodynamics!
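
The orders of magnitude quoted for the magnetic field can be reproduced in a few lines. This is a rough sketch, assuming CGS values for hbar and c and the estimate (hbar c)^(1/2) / L^2 for the fluctuation field; the helper name `fluctuation_field` and the derived crossover length are illustrative and are not taken from the text.

```python
import math

HBAR = 1.0546e-27  # erg s (CGS, assumed value)
C = 2.998e10       # cm / s (assumed value)


def fluctuation_field(length_cm):
    """Order-of-magnitude field fluctuation (hbar c)^(1/2) / L^2, in gauss."""
    return math.sqrt(HBAR * C) / length_cm ** 2


B_CLASSICAL = 3.3e-8  # gauss, amplitude quoted for a 1 millivolt/meter wave

wavelength = C / 1.0e6               # cm, for a wave of 10^6 cycles/sec
print(math.sqrt(HBAR * C))           # about 6e-9 (erg cm)^(1/2)
print(fluctuation_field(wavelength)) # fluctuation at the radio wavelength
print(fluctuation_field(1.0))        # detector of dimension 1 cm
print(fluctuation_field(0.1))        # detector of dimension 0.1 cm

# Length below which the fluctuation estimate exceeds the classical amplitude.
crossover = math.sqrt(math.sqrt(HBAR * C) / B_CLASSICAL)
print(crossover)
```

On these assumptions the fluctuation estimate overtakes the classical amplitude at a few millimeters, consistent with the statement that fluctuations dominate for L of about 0.1 cm or less.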

Similar considerations apply in geometrodynamics [21]. Quantum fluctuations
in the geometry are superposed on and coexist with the large-scale slowly
varying curvature predicted by classical deterministic general relativity.
Thus, in a region of dimension $L$, where in a local Lorentz frame the normal
values of the metric coefficients will be $-1, 1, 1, 1$, there will occur
fluctuations in these coefficients of the order

$$\Delta g \sim \frac{L^*}{L} \tag{29}$$

fluctuations in the first derivatives of the $g_{ik}$’s of the order

$$\Delta\Gamma \sim \frac{\Delta g}{L} \sim \frac{L^*}{L^2} \tag{30}$$

and fluctuations in the curvature of space of the order

$$\Delta R \sim \frac{\Delta g}{L^2} \sim \frac{L^*}{L^3} \tag{31}$$

Here

$$L^* = \left(\frac{\hbar G}{c^3}\right)^{1/2} = 1.6 \times 10^{-33}\ \text{cm} \tag{32}$$

is the so-called [22] Planck length. It is appropriate to look at orders of
magnitude. The curvature of space within and near the earth, according to
classical Einstein theory, is of the order

$$R \sim \left(\frac{G}{c^2}\right)\rho \sim (0.7 \times 10^{-28}\ \text{cm/g})(5\ \text{g/cm}^3) \sim 4 \times 10^{-28}\ \text{cm}^{-2} \tag{33}$$


This quantity has a very direct physical significance. It measures the
“tide-producing component of the gravitational field” as sensed, for example,
in a freely falling elevator or in a space ship in free orbit around the
earth [23]. By comparison the quantum fluctuations in the curvature of space
are only

$$\Delta R \sim 10^{-33}\ \text{cm}^{-2} \tag{34}$$

even in a domain of observation as small as 1 cm in extent. Thus the quantum
fluctuations in the geometry of space are completely negligible under
everyday circumstances.
Even in atomic and nuclear physics the fluctuations in the metric,

$$\Delta g \sim \frac{10^{-33}\ \text{cm}}{10^{-8}\ \text{cm}} = 10^{-25}$$

and

$$\Delta g \sim \frac{10^{-33}\ \text{cm}}{10^{-13}\ \text{cm}} \sim 10^{-20} \tag{35}$$

are so small that it is completely in order to idealize the physics as taking place
in a flat Lorentzian spacetime manifold.
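
The same kind of arithmetic gives the geometrical estimates. The sketch below assumes CGS values of hbar, G, and c, a mean density of 5 g/cm^3 for the earth as in the text, and the scaling rules Delta g ~ L*/L and Delta R ~ L*/L^3; the function names are illustrative.

```python
import math

HBAR = 1.0546e-27  # erg s (assumed CGS value)
G = 6.674e-8       # cm^3 g^-1 s^-2 (assumed CGS value)
C = 2.998e10       # cm / s (assumed CGS value)

PLANCK_LENGTH = math.sqrt(HBAR * G / C ** 3)  # cm


def metric_fluctuation(length_cm):
    """Dimensionless fluctuation in the metric, L* / L."""
    return PLANCK_LENGTH / length_cm


def curvature_fluctuation(length_cm):
    """Fluctuation in curvature, L* / L^3, in cm^-2."""
    return PLANCK_LENGTH / length_cm ** 3


earth_curvature = (G / C ** 2) * 5.0  # cm^-2, for a density of 5 g/cm^3

print(PLANCK_LENGTH)                 # about 1.6e-33 cm
print(earth_curvature)               # classical curvature near the earth
print(curvature_fluctuation(1.0))    # fluctuation seen over 1 cm
print(metric_fluctuation(1.0e-8))    # atomic dimensions
print(metric_fluctuation(1.0e-13))   # nuclear dimensions
```

The fluctuation in curvature over 1 cm comes out some five orders of magnitude below the classical curvature near the earth, and the metric fluctuation reaches unity only when L approaches the Planck length itself.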

The quantum fluctuations in the geometry are nevertheless inescapable, 
if we are to believe the quantum principle and Einstein’s theory. They 
coexist with the geometrodynamical development predicted by classical 
general relativity. The fluctuations widen the narrow swathe cut through 
superspace by the classical history of the geometry. In other words, the 
geometry is not deterministic, even though it looks so at the everyday scale of 
observation. Instead, at a submicroscopic scale it “resonates” between one
configuration and another and another. This terminology means no more
and no less than the following: (1) Each configuration $^{(3)}\mathcal{G}$
has its own probability amplitude $\psi = \psi(^{(3)}\mathcal{G})$. (2)
These probability amplitudes have comparable magnitudes for a whole range of
3-geometries included within the limits (29) on either side of the classical
swathe through superspace. (3) This range of 3-geometries is far too
variegated on the submicroscopic scale to fit into any one 4-geometry, or
any one classical geometrodynamical history. (4) Only when one overlooks
these small-scale fluctuations ($\sim 10^{-33}$ cm) and
examines the larger-scale features of the 3-geometries do they appear to fit 
into a single space-time manifold, such as comports with the classical field 
equations. 




Extrapolate Geometrodynamics to the Planck Scale of Distances?


Is it not preposterous to apply existing theory in a realm of dimensions 
smaller than nuclear sizes by twenty powers of ten? What a fantastic extra- 
polation! Yet it is the tradition of theoretical physics to adopt what might be 
called a “strong bargaining posture.” It is not the custom to give up any
long-established principle without pushing it to the limit and finding out 
where, if anywhere, it goes wrong. A direct contradiction between a predic- 
tion and an observation, or between two points of principle, is ordinarily 
necessary if one is to have any solid ground for change, or even any indication 
where to make a change! The physicist does not have the habit of giving up 
something unless he gets something better in return. 

To pursue systematically the consequences of quantum geometrodynamics is
recommended not merely by the absence of any contradiction and by the
absence of any more comprehensive theory. It is also made attractive by an
example out of the past. Who in the 1850’s, measuring the attraction
between electric charges and testing the Coulomb law at distances from
meters to millimeters, could have predicted that it would be proved valid in
1911 to $10^{-12}$ cm, in 1933 to $10^{-13}$ cm, and in 1963 to $10^{-14}$
cm? The fantastic extrapolatory power of basic physical theory always seems
a miracle [24]!

Electrodynamics contains no natural length. Neither do general
relativity ($G$, $c$) or the quantum principle ($\hbar$) individually. The
union of geometrodynamics and the quantum does:
$L^* = (\hbar G/c^3)^{1/2}$. The importance of what is essentially this
length was stressed by Planck as early as 1899 [25]. He had taken up the
study of blackbody radiation not least because it is universal:
independent of the shape and size of the container, independent of the proper- 
ties of its walls, and independent of the complexities of atomic, molecular, and 
solid state physics. In keeping with this search for the universal, he asked for 
standards of length, mass, and time that are independent of such special cir- 
cumstances as the size of the planet we happen to inhabit, its period of rota- 
tion, and the density of the fluid that covers it. In excluding reference to 
special substances he found it natural also to exclude reference to special par- 
ticles: both the electron, with its then known mass and charge, and all heavier 
entities—objects that even today pose unsolved structural problems. Exclud- 
ing all else, he was left with the speed of light, the Newtonian constant of 
gravitation, and the constant newly discovered from the analysis of blackbody 
radiation as the quantities he was willing to accept as truly fundamental. Out 
of these three quantities there is but one way to construct a length. Planck’s 
length, introduced to science before either special or general relativity, first 
acquired an understandable role in the context of quantum geometrodynamics 
as measure of the fluctuations in the geometry of space. 


Accept seriously a length as small as $10^{-33}$ cm? Try to assess in any
detail at all the physics that goes on at a scale of distances shorter by
twenty powers of ten than the $10^{-13}$ cm of elementary particle physics?
What could be more preposterous? Only three numbers are still more
preposterous than $10^{20}$: the factor of $10^{40}$ that distinguishes
electric forces from gravitational forces; the $10^{40}$ from elementary
particle dimensions to the estimated radius of the universe at the phase of
maximum expansion; and the $10^{80}$ that furnishes an order of magnitude
estimate as good as any that one knows for the number of particles in the
universe. Eddington [26], Dirac [27], Jordan [28],
Dicke [29], and Hayakawa [30] argue that it is unreasonable to think of such 
enormous numbers as having independent roles in physics. The corres- 
pondence between these numbers cannot be purely accidental, they stress [31]; 
there could hardly be a “regularity of the large numbers” if there were not a
deep connection between cosmology, general relativity, and elementary par- 
ticle physics. But where to begin in looking for this connection? Begin with 
asking why there are so and so many particles in the universe? Hardly. 
Physics can elucidate laws of motion, but it has proved powerless to explain 
initial conditions. Begin with asking why the universe has such and such 
dimensions? Again a matter of initial conditions, outside of the present 
scope of physics. Begin with trying to explain the charge structure of
elementary particles, or the characteristic dimension of $10^{-13}$ cm? More
hopeful, perhaps, but still beyond present power. No, of all the quantities coupled by
the large numbers, one alone has a clear status within existing theory: the 
Planck length. Where else than here can one begin? 
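
Two of the large numbers can be estimated directly from the constants. This is an order-of-magnitude sketch only: it assumes CGS values of the constants, takes the electron-proton pair as the representative case for comparing electric and gravitational forces (the text does not specify the pair), and takes $10^{-13}$ cm as the elementary particle dimension.

```python
import math

HBAR = 1.0546e-27       # erg s (assumed CGS value)
G = 6.674e-8            # cm^3 g^-1 s^-2 (assumed CGS value)
C = 2.998e10            # cm / s (assumed CGS value)
E_CHARGE = 4.803e-10    # esu (assumed CGS value)
M_PROTON = 1.6726e-24   # g
M_ELECTRON = 9.109e-28  # g

# Ratio of electric to gravitational attraction; the separation cancels
# because both forces fall off as the inverse square of the distance.
force_ratio = E_CHARGE ** 2 / (G * M_PROTON * M_ELECTRON)

# Ratio of the elementary particle dimension to the Planck length.
planck_length = math.sqrt(HBAR * G / C ** 3)
length_ratio = 1.0e-13 / planck_length

print(math.log10(force_ratio))   # powers of ten separating the two forces
print(math.log10(length_ratio))  # powers of ten from particles to L*
```

The first logarithm comes out near 39 to 40 and the second near 20, in line with the round figures quoted above.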

It could seem plausible to stop considering physics a little below
$10^{-14}$ cm because accelerator budgets stop a little above $100 million a year.
What good is it, one sometimes asks, to analyze what goes on if one has no 
way to observe it? Happily, experience has taught views less biased by man- 
kind’s temporary limitations. Wait until one had mastered the work 
hardening of metals before one took up the microscope and saw dislocations ? 
Wait until one had explained dislocations before one started the study of 
atoms? Not so. The route to understanding did not go down the ladder,
1 cm → $10^{-4}$ cm → $10^{-8}$ cm, but up: $10^{-8}$ cm → $10^{-4}$ cm →
1 cm. One had to understand something about atoms before he could explain
dislocations, and something about dislocations before he could uncover the
rationale of work hardening. Is it possible that one similarly must have
some perspective on what happens at $10^{-33}$ cm before one can find the
rationale of particles and $10^{-13}$ cm [32]? Right or wrong [33], quantum
geometrodynamics alone has
any suggestions to offer on this point. What then does it say? 

Every new perspective offered by this long established theory radiates 
out from the central prediction: geometry fluctuates violently at small dis- 
tances. This concept opens new views on the nature of electric charge, on the 
nature of the vacuum, and on the nature of particles. 




ELECTRICITY AS LINES OF FORCE TRAPPED IN THE 
TOPOLOGY OF SPACE 


To arrive at a new vision of electricity it is enough to question an old
view of topology: “Space is Euclidean in character at small distances.” The
view is reasonable enough for everyday purposes. Equally reasonable is the 
conception of the surface of the ocean as endowed with Euclidean topology— 
reasonable to one flying miles above it. To one in a small boat the opposite 
impression is inescapable. He sees the breaking waves and the foam. He 
knows that the surface is multiply connected at the scale of millimeters and 
centimeters. If the ocean is violent, the geometry of space on the Planck 
scale of distances is even more violent. Nowhere is there any region of calm. 
Moreover, if the equations of hydrodynamics are nonlinear, so are the
equations of geometrodynamics. What a contrast to the linearity of
electrodynamics! There the predicted fluctuations in potential

$$\Delta A \sim \frac{(\hbar c)^{1/2}}{L} \tag{36}$$

and in field

$$\Delta B \sim \frac{(\hbar c)^{1/2}}{L^2} \tag{37}$$

preserve always the same character, regardless of the smallness of the distances
$L$ to which one goes in his probing. There is no natural magnitude to mark
off large fluctuations as different in nature from small ones. The contrary is
the case for fluctuations in the metric, governed by the formula

$$\Delta g \sim \frac{L^*}{L} \tag{38}$$

Values of $\Delta g$ comparable to unity and larger indicate changes in
geometry so drastic that the word “curved space” is hardly adequate to
describe them. “Changes in topology” seems a more reasonable description.

It is not so natural in mathematics as in physics to consider a transfor- 
mation that alters one topology to another. An oscillating drop of water 
undergoes fission. The topology changes. A point marks the place of 
separation of the two masses of liquid. That point lacks the full neighborhood
of points that characterizes a normal point. Such a critical point is
ruled out from any proper manifold by the very definition of the term “manifold”
in mathematics. Before the division, the surface of the drop constituted
a manifold. After the division, it is again a manifold, consisting of two dis- 
parate pieces. At the instant of division it is not a manifold. But little 


264 JOHN ARCHIBALD WHEELER 


attention does the drop pay to this distinction. It divides, despite all
definitions. No more reason does one see in the definition of “manifold” against
space changing its topology.

No principle is at hand that would give one topology perpetual pref- 
erence over all others. On the contrary, the field equations of relativity are 
purely local in character. They make no statements at all about global 
topology, as Einstein himself emphasized more than once. Moreover, the 
whole character of physics speaks for the theme that “everything that can
happen will happen.” An alpha particle penetrates through a region classically
forbidden to it; the side group on a chain molecule undergoes “hindered
rotation”; and the umbrella structure of an ammonia molecule turns inside
out despite the apparent contradiction to the law of conservation of energy.
It is difficult to resist the conclusion that likewise the topology of space can 
change and does change. 

If these general considerations are relevant, and if fluctuations alter the 
topology of space as well as its curvature [34], the consequences are decisive 
for the nature of the physics that goes on at small distances, and even for the 
nature of superspace itself. Superspace has to be broadened from the totality 
of positive definite 3-geometries built on one topology to the totality of posi- 
tive definite 3-geometries built on the totality of all topologies. It has new 
implications to say that the probability amplitude $\psi(^{(3)}\mathcal{G})$ is appreciable for a
swathe of points in superspace, with a finite spread about the deterministic 
history of classical geometrodynamics (Fig. 5): Geometry in the small fluctu- 
ates not only from one microscopic pattern of curvature to another, but much 
more, from one microscopic topology to another. Moreover, those structures
that are everywhere full of submicroscopic “handles” or “wormholes”
are overwhelmingly more numerous than 3-geometries of simpler topology.
In other words, space “resonates” between one foamlike structure and
another [35]. The space of quantum geometrodynamics can be compared to
a carpet of foam spread over a slowly undulating landscape. The undulations
symbolize deterministic classical geometrodynamics. The continual micro- 
scopic changes in the carpet of foam as new bubbles appear and old ones dis- 
appear symbolize the quantum fluctuations in the geometry. The fluctuations 
change the microscopic connectivity of space itself. No longer is one 
entitled to take it for granted that space is Euclidean in the small. 

Nowhere in physics does the structure of geometry in the small play
a larger part than in electricity. Electric lines of force converge onto a region
of space and none come out of it. Something strange must go on in that
region. Either Maxwell’s equations break down, or the region is filled with a
special substance, an electric jelly, a magic fluid beyond further explanation.
From the one picture or the other there has never been an escape. The reason
is simple. The region is tacitly assumed to have Euclidean topology. Give
up this assumption [36], but hold to Maxwell’s field equations for empty
space. Then the conclusion changes. The region in question must contain
the mouth of at least one “handle” or “wormhole” (Fig. 6). Electric lines of
force converge upon this mouth only to emerge from the other mouth,
located somewhere else in space. One comes in this way to a new picture of
electricity: A classical geometrodynamical electric charge is a set of lines of
force trapped in the topology of space.




FIGURE 5. Symbolic representations of three alternative probability functions
$\psi(^{(3)}\mathcal{G})$. Above, $\psi_0$, normal fluctuations alone; middle, $\psi_1$,
macroscopic classical gravitational wave plus superposed fluctuations;
below, $\psi_2$, localized excitation plus superposed fluctuations. The
complex number $\psi$ gives the probability amplitude for the occurrence
of the 3-geometry $^{(3)}\mathcal{G}$. Those 3-geometries that contribute
most to the totalized probability are highly multiply connected at
the scale of the Planck length (“foamlike structure of space in the
small”).



Someone can be imagined who first studies topology, next takes to 
heart Maxwell’s equations for empty space, and then for the first time sees an 
electric charge. He takes it as experimental evidence that space must be 
multiply connected in the small (Table 2). Nothing prevents us from adop- 
ting the same point of view. On this view, the occurrence of electric charges 
in nature is the single most impressive piece of evidence available today for the 
reality of the fluctuations that quantum theory predicts in the geometry of space, 
and suggests in the topology of space, at the Planck scale of distances. 

The “wormholes” predicted by quantum geometrodynamics are a
property of all space, are submicroscopic, and they and the fluxes through
them arise spontaneously, through quantum fluctuations. Nothing prevents
one from considering also a single wormhole, of macroscopic dimensions,
created ab initio, with a prescribed flux threading through it, and evolving
deterministically in time in accordance with the classical field equations.
However, this classical electric charge has not the slightest direct connection
with the charges of the real world of quantum physics and requires no
consideration here.


FIGURE 6. Classical geometrodynamical concept of charge as “electric lines of
force trapped in the topology of space.” The “wormhole” connects
two regions in one otherwise nearly Euclidean space, not two
different Euclidean spaces. The distances between the two mouths
of the wormhole (1) via the nearly Euclidean space and (2) via the
route through the wormhole are permitted by Einstein’s field
equations to be quite different, even in order of magnitude, contrary
to what one might assume from the figure. There the geometry
(one dimension suppressed!) is depicted for simplicity as if embedded
in flat Euclidean 3-space. The third dimension (distance “off” of
the geometry) is to be considered as unattainable as the stratosphere
is unattainable to an ant crawling on the surface of the earth. An
observer endowed with an instrument of inadequate resolving power
sees one wormhole mouth as a positive charge, the other as a negative
charge. To be contrasted with this classical picture of single identifiable
wormholes is the quantum-mechanical picture (Fig. 4) of a
submicroscopic foamlike wormhole structure constantly fluctuating
throughout all space.


TABLE 2 The Concepts of Electric Charge in Quantum and Classical
Geometrodynamics, Compared and Contrasted

Concept: This type of electric charge interpretable as electric lines of force
trapped in the topology of space?
    Classical: Yes
    Quantum: Yes

Concept: Any “real” charge considered to be present?
    Classical: No
    Quantum: No

Concept: Nature of topological trap?
    Classical: Wormhole in 3-geometry
    Quantum: Wormhole in 3-geometry

Concept: Where are wormholes located?
    Classical: Connecting opposite charges
    Quantum: Throughout all space

Concept: Dimension of wormhole
    Classical: Enormous compared to $10^{-33}$ cm
    Quantum: Comparable to $10^{-33}$ cm

Concept: Status of wormhole?
    Classical: Classical deterministic development with time; macroscopic
        size; this size determined by initial conditions; no observational
        evidence that any such macroscopic wormhole ever occurred, although
        by definition it is one of the solutions of Einstein’s field equations
    Quantum: “Fluctuation”; consequence of fact geometry is not
        deterministic; that is, a consequence of the nonzero probability that
        exists for quite diverse 3-geometries; 3-geometries with wormholes
        almost everywhere (scale of $10^{-33}$ cm) are the most numerous
        (“foamlike 3-geometry”); wormholes do not have to be initiated; they
        occur naturally (“zero-point disturbance of the vacuum”) and cannot
        be avoided

Concept: Wormhole pinches off?
    Classical: Yes. Throat of radius $a$ undergoes gravitational collapse in
        time of order $a/c$ ($10^{-10}$ sec for $a \sim 3$ cm)
    Quantum: Question strictly speaking is undefined; no meaning to
        deterministic small-scale geometrodynamics in context of quantum
        geometrodynamics; but in the loose way of speaking that is so useful
        in certain parts of quantum field theory one can say that “virtual
        wormholes” are continually being created and annihilated as “space
        resonates between one foamlike 3-geometry and another”

Concept: Charge as identified by flux through wormhole?
    Classical: Does not change with time; a constant of the motion; not
        quantized
    Quantum: Flux through any one wormhole, like shape and dimensions of
        wormhole, subject to quantum-mechanical fluctuations; order of
        magnitude of this “fluctuation charge,” $(\hbar c)^{1/2} \sim 12e$; not
        “constant” and not “quantized”

Concept: Relation of charge on an elementary particle to this kind of
geometrodynamical charge?
    Classical: Not the slightest direct connection with classical
        geometrodynamical electric charge
    Quantum: Not the slightest direct connection with the charge on an
        elementary “fluctuation wormhole.” Particle viewed as a quantum
        state of collective excitation of the entire geometrical continuum
        (“geometrodynamical exciton”). Charge associated with this exciton
        to be calculated when one understands how to do the calculation!
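
The order of magnitude quoted in Table 2 for the “fluctuation charge” can be checked in one line. A minimal sketch, assuming Gaussian units, in which the fine-structure constant is $\alpha = e^2/\hbar c \approx 1/137$, so that $(\hbar c)^{1/2} = e/\alpha^{1/2}$; the constant and function names are illustrative.

```python
import math

# Fine-structure constant alpha = e^2 / (hbar c) in Gaussian units (assumed value).
FINE_STRUCTURE_CONSTANT = 1.0 / 137.036


def fluctuation_charge_in_units_of_e(alpha=FINE_STRUCTURE_CONSTANT):
    """Return (hbar c)^(1/2) expressed as a multiple of the elementary charge e."""
    return 1.0 / math.sqrt(alpha)


print(f"(hbar c)^(1/2) ~ {fluctuation_charge_in_units_of_e():.1f} e")
```

The result is close to twelve elementary charges, which is the sense in which the fluctuation charge is neither small compared to $e$ nor an integral multiple of it.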



THE ENERGY OF THE VACUUM 


If one insight out of fluctuations in geometry has to do with the nature
of electricity, a second has to do with the energy of the vacuum. Already in
quantum electrodynamics one has long known from both observation and
theory that when one examines a region of the vacuum of dimension $L$
(1) the fluctuation energy is found to be of the order $\hbar c/L$, and (2) the effective
density of energy of the fluctuation is found to be of the order $\hbar c/L^4$. Not
known out of electrodynamics is any natural lower limit for the $L$ values that
come into consideration. In quantum geometrodynamics formulas of the
same type hold, but there is now a natural cutoff: the Planck length. It is
unreasonable to apply linear theory when $L$ is of the order of
$L^* = (\hbar G/c^3)^{1/2} = 1.6 \times 10^{-33}$ cm or less. In other words, there is a
certain sense in which one can say: (1) Elementary fluctuations measure in
energy up to $\hbar c/L^*$. The magnitude of this characteristic mass-energy is
$\sim 10^{-5}$ g or $\sim 10^{28}$ eV; that is, about twenty powers of ten greater than the
mass of an elementary particle and nine powers of ten greater than the energy
of the most energetic cosmic ray that has ever been found. (2) These
fluctuations take place throughout all space. (3) The density of the
electromagnetic energy associated with these fluctuations [37] is of the
characteristic order of magnitude $\hbar c/L^{*4} = c^7/\hbar G^2 \sim 10^{95}$ g/cm$^3$,
stupendous in comparison with the $10^{14}$ g/cm$^3$ of nuclear
matter. (4) The effective density of the “gravitational” or geometrodynamical
wave energy [38] associated with these fluctuations is of the same order
of magnitude.
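
The Planck-scale figures in this paragraph follow from three constants. The sketch below evaluates them in CGS units; the constant values are standard rounded figures supplied here as assumptions, and the function names are illustrative. Numerical factors of order unity (for instance $\hbar$ versus $h$) are not fixed by the order-of-magnitude argument, so the outputs should be compared with the text only as powers of ten; the density in particular comes out within a couple of powers of ten of the round figure quoted.

```python
import math

# Physical constants in CGS units (assumed, rounded standard values).
HBAR = 1.054571817e-27   # erg s
G = 6.67430e-8           # cm^3 g^-1 s^-2
C = 2.99792458e10        # cm / s
ERG_PER_EV = 1.602176634e-12


def planck_length_cm():
    """L* = (hbar G / c^3)^(1/2)."""
    return math.sqrt(HBAR * G / C**3)


def planck_mass_g():
    """Characteristic mass-energy hbar c / L*, expressed as a mass (hbar c / G)^(1/2)."""
    return math.sqrt(HBAR * C / G)


def planck_energy_ev():
    """The same mass-energy expressed in electron volts."""
    return planck_mass_g() * C**2 / ERG_PER_EV


def planck_density_g_per_cm3():
    """Energy density hbar c / L*^4 expressed as a mass density, m* / L*^3."""
    return planck_mass_g() / planck_length_cm() ** 3


print(f"L*      ~ {planck_length_cm():.2e} cm")
print(f"m*      ~ {planck_mass_g():.2e} g")
print(f"m* c^2  ~ {planck_energy_ev():.2e} eV")
print(f"density ~ {planck_density_g_per_cm3():.2e} g/cm^3")
```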

Every observation shows that the net density of energy in space is negligible
by comparison with these huge figures. The enormous positive energy
must be compensated in some way. How the compensation comes about was
a major concern of Niels Bohr over the years. A new approach to the problem
shows up when one considers the fluctuations at the Planck scale of distances.
Two such fluctuations, each of mass-energy $\sim(\hbar c/G)^{1/2} \sim 10^{-5}$ g,
interacting gravitationally at the distance $L^*$, have a coupling energy that is
negative,

$$E_{\text{grav}} \sim -\frac{G m_1 m_2}{r_{12}} \sim -\frac{G\left[(\hbar c/G)^{1/2}\right]^2}{L^*} \sim -\frac{\hbar c}{L^*} \tag{39}$$

and of the order $-10^{-5}$ g. This coupling between neighboring “fluctuons”
thus has such a sign and such a magnitude as to be appropriate for compensating
the energies of the individual fluctuons. It would seem surprising if this
mechanism did not play a dominant part in bringing about compensation of
vacuum energies.
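
The claim that the coupling has the right magnitude for compensation can be traced in a few lines. The following worked step is a sketch that assumes only the Planck mass $m^* = (\hbar c/G)^{1/2}$ and the Planck length $L^* = (\hbar G/c^3)^{1/2}$; it shows that the gravitational coupling energy of two neighboring fluctuations equals, in order of magnitude, the rest energy of one of them.

```latex
\begin{aligned}
E_{\text{grav}}
  &\sim -\frac{G\,(m^*)^2}{L^*}
   = -\frac{G\,(\hbar c/G)}{L^*}
   = -\frac{\hbar c}{L^*} \\
  &= -\hbar c \left(\frac{c^3}{\hbar G}\right)^{1/2}
   = -\left(\frac{\hbar c^5}{G}\right)^{1/2}
   = -\left(\frac{\hbar c}{G}\right)^{1/2} c^2
   = -m^* c^2 .
\end{aligned}
```

The negative coupling energy is thus of the same order as the positive energy $\hbar c/L^*$ of a single fluctuation, which is what makes a cancellation at this scale conceivable.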



Despite all the mysteries that enshroud the compensation process, one 
conclusion stands out clear: Individually the components of the vacuum energy 
are enormous, and collectively they compensate. 


A PARTICLE AS A GEOMETRODYNAMICAL EXCITON 


Quantum fluctuations in geometry offer not only a new picture of elec- 
tricity and a new view of the violence of the vacuum, but also a third vista: 
the concept of a particle as a quantum state of excitation of the geometry 
of space. 

A bit of nuclear matter with its density of $\sim 10^{14}$ g/cm$^3$ is completely
unimportant compared to the calculated density of $\sim 10^{95}$ g/cm$^3$ of the
fluctuation energy of the vacuum. A particle means less to the physics of the
vacuum than a cloud ($10^{-6}$ g/cm$^3$) means to the physics of the sky
($10^{-3}$ g/cm$^3$). No single fact points more powerfully than this to the
conclusion that a “particle” is not the right starting point for the description
of nature.

From the standpoint of geometrodynamics the primordial entity is
not one particle, nor an intercoupled family of particle fields, but the geometry
of empty space itself. On this view a particle is not itself a $10^{-33}$-cm
fluctuation in the geometry; instead, it is a fantastically weak alteration in the
pattern of these fluctuations, extending over a zone containing very many such
$10^{-33}$-cm regions. In brief, a particle is a quantum state of excitation of the
geometry; it is a geometrodynamical exciton. In mathematical terms, the
vacuum is described by one probability amplitude $\psi_0 = \psi_0(^{(3)}\mathcal{G})$; and states
where one or more particles are present are described by other functionals
$\psi = \psi(^{(3)}\mathcal{G})$.

On this interpretation elementary particle physics ranks as a new and 
beautiful kind of chemistry. First came the chemistry of atoms and mole- 
cules, marvelous in its complexities and also in its regularities, all built on one 
single simple dynamical entity, the electron. Then came “nuclear chemistry,”
with nuclear shapes and energies and reaction rates all going back for their 
explanation to the dynamics of another elementary dynamical entity, the 
nucleon. Today we deal in effect with a chemistry of the elementary particles 
themselves. Wonderful advances in the subject classify the particles into 
families, tie their masses together into mathematical regularities, and system- 
atize their rates of transformation—without, however, revealing the identity 
of the elementary dynamical entity beneath it all. That entity, on the present 
view, is geometry itself.

When in the first half of the nineteenth century Berzelius proposed that 
chemical forces are a manifestation of electrical forces [39], he excited investi- 
gations by many workers that eventually discredited his hypothesis. The 
homopolar bond: how could the observed affinity be reconciled with the
known repulsion between like electric charges? Homopolar forces, ionic
forces, van der Waals forces, valence forces: how could all this variety of 
magnitudes and particularities possibly be compatible with electrical forces, 
pure and simple? The tide against the electrical interpretation of chemical 
forces only turned with the discovery of the electron by J. J. Thomson in 1897. 
The dynamical entity once identified, the unraveling of the mystery eventually 
had to follow. Still it was not easy for the imagination to grasp what organiz- 
ing power the quantum principle possesses. In encounters in the mid 1920’s 
more than one physicist told his colleague from the laboratory across the way, 
“Your chemistry is now passé. All that jumble can now be explained in terms
of electrons and quantum numbers.” In more than one case the then justified
reply came back, “What makes you think your circular and elliptic orbits have
anything to do with chemistry? Have you ever heard of the valence angles of
ammonia or the tetrahedral bonds of carbon? Don’t ever forget that electrical
forces are electrical forces and chemical forces are chemical forces.” The
Coulomb law had to be supplemented by the concept of probability amplitude 
before Heitler and London could explain valence forces. Today no one 
doubts that the Schroedinger equation accounts in principle for all of chemis- 
try. Yet no surer way could be found to stop the advance of chemistry than 
to require everyone to calculate the wave function of his new compound 
before making it. Not the contemplation of 600-dimensional configuration 
space, but the analysis of the regularities between molecule and molecule, 
proves to be the fruitful way to make progress. It can hardly be otherwise 
when the energy of binding is the very small difference between the very much 
larger total energies of the associated and dissociated states. 

In “elementary particle chemistry” the art of analyzing regularities is
already highly advanced [40], thanks not least to applications of group theory 
even more far-reaching than the applications made in the chemistry of mole- 
cules. On the other hand, the possibility to derive particle masses from first 
principles would seem even more remote, from the standpoint of geometro- 
dynamics, than the possibility to calculate the binding energy of a complex 
molecule from first principles. The energy with which one hopes to end up, 
$10^{-27}$ g to $10^{-24}$ g, is smaller by twenty powers of ten than the characteristic
energy of the theory with which one starts. Still it would seem unwise to dis- 
count in advance the ingenuity of available methods to calculate reliably small 
effects against enormously larger backgrounds, as for example in the case of 
superconductivity [41] ($\sim 10^{-4}$ eV versus $\sim 10$ eV).

How can the geometrodynamical interpretation of particles (Table 3) be 
tested? Sooner to be expected than quantitative calculations are qualitative 
predictions and conceptual developments. Of such developments none gives 
more incentive than gravitational collapse [18] to believe in a tie between
particles and geometry; and none gives more encouragement to believe in the
relevance of the Planck length than the concept of charge as electric lines of force
trapped in the topology of space. 



TABLE 3 Quantum-Geometrodynamical Interpretation of
Particles and Forces

Ultimate dynamical object: Not an electron or any other kind of particle, but
    geometry itself

Geometry of space: Not unique and classical, but everywhere resonating at the
    scale of the Planck length between configurations of varied
    submicroscopic curvature and varied topology

Topology of space: Those geometries that occur with appreciable probability
    amplitude are highly multiply connected throughout all
    space (“foamlike structure”)

Particle: Not a foreign and physical entity moving about within the
    geometry of space, but a quantum state of excitation of that
    geometry itself; as unimportant for the physics of the vacuum
    as a cloud is unimportant for the physics of the sky.
    Not a localized ripple in the geometry, not a submicroscopic
    “wormhole” in the geometry of space at the Planck scale
    of distances, but an excitonlike change in the phase relations
    in the probability amplitude for a very large number of such
    wormholes

Charge: Not a place where Maxwell’s equations fail, not a mysterious
    “foreign and physical” jelly introduced into geometry from
    outside, but “lines of electric force trapped in the topology
    of space” (Table 2 and Problem 2)

Spin: Not a dynamical object added to geometry but the nonclassical
    two-valuedness associated with geometry itself,
    because distinct probability amplitudes attach to a multiply
    connected 3-geometry endowed with alternative triad fields
    or “spin structure” (Problem 2)

Force: Strong forces, weak forces, and intermediate forces no
    more distinct in their character than van der Waals forces,
    ionic forces, and valence forces; not themselves primordial,
    but the residual effect of percentage-wise negligible changes
    in the enormous density of the energy of the zero-point
    fluctuations taking place in the geometry of space at small
    distances


Widening vistas open out for further investigation [42]. (1) What can 
one give in the way of simple principles to short-circuit all the usual derivations
of Einstein’s field equations and pass in one leap from postulates to the 
Hamilton—Jacobi equation itself? (2) What deeper insights can one win into 
the structure of superspace? And (3) at what new point in geometrodynam- 
ics does one draw that old line between dynamic law and initial conditions 
that one sees throughout all of physics? 

These issues are the foothills. The mountain looms above them: Is a 
particle a state of excitation of the geometry of space? 



Einstein, above his work and writing, held a long-term vision: There is 
nothing in the world except curved empty space [43]. Geometry bent one 
way here describes gravitation. Rippled another way somewhere else it 
manifests all the qualities of an electromagnetic wave. Excited at still another 
place, the magic material that is space shows itself as a particle. There is 
nothing that is foreign and “physical” immersed in space. Everything that
is, is constructed out of geometry. This is the dream. Is this dream coming
to life? 


Note: This article is the revised and extended version of reports made 
at the Académie Internationale de Philosophie des Sciences, Oberwolfach/ 
Freiburg im Breisgau, July, 1966; the International School of Nonlinear Math- 
ematics and Physics, Munich, July, 1966; the Society of Engineering Science, 
Raleigh, North Carolina, October, 1966; the Colloque Internationale sur 
Fluides et Champ Gravitationel en Relativité Générale, Paris, June, 1967; and 
the Battelle Rencontres in Mathematics and Physics, Seattle, July, 1967. It 
appears in the proceedings of the second, third, and fifth organizations in 
English and in the proceedings of the fourth in French. Appreciation is ex- 
pressed to the organizers of these conferences for their hospitality and to many 
colleagues for discussions and advice, among them especially Y. Choquet, 
Bryce DeWitt, Cecile DeWitt, L. Ehrenpreis, U. Gerlach, H. Leutwyler, C. W. 
Misner, and M. Stern. 


PROBLEM 1. “DERIVATION” OF
“EINSTEIN-HAMILTON-JACOBI EQUATION”


If one did not know the Einstein-Hamilton—Jacobi (EHJ) equation 
(Eq. (12) or (13)), how might one hope to derive it straight off from plausible 
first principles, without ever going through the formulation of the Einstein 
field equations themselves? To find one such direct derivation of the EHJ 
equation would not seem rash to hope for when one already knows five ways 
to derive the field equations in their traditional form: (1) Einstein’s original 
derivation based upon the correspondence with Newtonian gravitational 
theory; (2) Weyl’s derivation based upon enumeration of all covariant differ- 
ential operations on the metric tensor that are linear in the second derivatives 
and that contain no higher derivatives; (3) Hilbert’s derivation from a varia- 
tional principle; (4) Cartan’s capture [44] of the geometrical content of the 
field equations [45]; and (5) the derivation by Gupta, Thirring, and Feynman 
starting with the theory of a field of spin two and mass zero in flat space.

The central starting point in the proposed derivation would necessarily
seem to be “imbeddability.” On the basis of experience one can reasonably
ask that the desired Hamilton-Jacobi equation, of course combined as always
with the conditions of constructive interference, should pick out $^{(3)}\mathcal{G}$’s that
will fit together into a $^{(4)}\mathcal{G}$.

In what way would one violate the “condition of imbeddability” if,
for example, one left the differential operator unchanged in (13) but replaced
the term $^{(3)}R$ by the square or by some other function of this curvature
scalar? Without directly answering this question, one can say that the
special form of the equation is governed in the most direct possible way by the
four-dimensional character of space-time. When geometrodynamics is put
into simplest terms, it comes out as the statement



(CURVATURE) = (DENSITY OF MASS-ENERGY) 


In the present considerations one looks apart from situations where there is
any “real” mass-energy present. To write the equation as if there were such
a term on the right is, however, a reminder that one is speaking about a
tensorial quantity, dependent therefore not only upon one’s choice of point, but
also upon the choice of direction, or unit 4-vector, at that point. This
circumstance illuminates the meaning of the curvature term on the left [44, 45].
It has to do with curvature of the 4-geometry in a tangent plane normal to the
4-vector in question. Good! But we are considering conditions, not merely
at one point, but at a threefold infinity of points, that form a spacelike
3-geometry. This 3-geometry is not necessarily “free of extrinsic curvature”
(vanishing “tensor $K_{ij}$ of extrinsic curvature” or vanishing “second
fundamental form”) at the point under study. If it is, excellent! Then the desired
curvature is given directly by the scalar curvature invariant $^{(3)}R$ of the
geometry intrinsic to $^{(3)}\mathcal{G}$ at the point in question. However, any nonvanishing
“extrinsic curvature of the 3-geometry relative to the enveloping 4-geometry”
makes an additional contribution to the scalar curvature intrinsic to $^{(3)}\mathcal{G}$.
One has to correct for this contribution before one secures a proper measure,

$$(\operatorname{Tr} K)^2 - \operatorname{Tr} K^2 + {}^{(3)}R$$

of the curvature of the 4-geometry itself in the tangent plane in question
(Gauss-Codazzi formula). Why this special bilinear expression in the tensor
$K_{ij}$ of extrinsic curvature and not some other one [46]? The rationale shows
most directly when one considers the special case where the 4-geometry itself
is flat. In this case the “correction terms” in the expression for the desired
curvature must exactly compensate $^{(3)}R$. Let the 3-geometry be imbedded in
the 4-geometry with “principal radii of curvature” at the point in question
equal to $\rho_1$, $\rho_2$, and $\rho_3$. Then the scalar curvature invariant of the intrinsic
geometry is $^{(3)}R = -2/\rho_2\rho_3 - 2/\rho_3\rho_1 - 2/\rho_1\rho_2$. The difference in sign here
compared to the familiar $^{(3)}R = 6/a^2$ for a 3-sphere of radius $a$ arises from the
fact that the “radius of curvature” is being measured in a 4-geometry of
signature $- + + +$. The tensor of extrinsic curvature is

$$K = \begin{pmatrix} 1/\rho_1 & 0 & 0 \\ 0 & 1/\rho_2 & 0 \\ 0 & 0 & 1/\rho_3 \end{pmatrix}$$

It is desired to build up out of this tensor an expression that (1) is bilinear in
the reciprocals of the $\rho_i$; (2) contains no terms in $1/\rho_1^2$, etc.; and (3)
compensates $^{(3)}R$. These three requirements lead uniquely to the stated formula.
The considerations outlined here by no means come close to providing a
derivation of the Einstein-Hamilton-Jacobi equation, but they perhaps
suggest some of the factors that may play a natural part in such a derivation.
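
The compensation argument for the flat case can be verified numerically. The sketch below assumes the diagonal form of the extrinsic curvature tensor with entries $1/\rho_i$ and the intrinsic curvature $^{(3)}R = -2(1/\rho_1\rho_2 + 1/\rho_2\rho_3 + 1/\rho_3\rho_1)$, as reconstructed from the paragraph above; the function names and the sample radii are illustrative.

```python
def intrinsic_scalar_curvature(rho1, rho2, rho3):
    """(3)R for principal radii rho_i, with the sign convention of signature -+++."""
    return -2.0 * (1.0 / (rho1 * rho2) + 1.0 / (rho2 * rho3) + 1.0 / (rho3 * rho1))


def extrinsic_correction(rho1, rho2, rho3):
    """(Tr K)^2 - Tr K^2 for the diagonal tensor K = diag(1/rho1, 1/rho2, 1/rho3)."""
    eigenvalues = [1.0 / rho1, 1.0 / rho2, 1.0 / rho3]
    trace = sum(eigenvalues)
    trace_of_square = sum(k * k for k in eigenvalues)
    # The squares 1/rho_i^2 cancel, leaving only the cross terms 2/(rho_i rho_j).
    return trace**2 - trace_of_square


def four_curvature_measure(rho1, rho2, rho3):
    """(Tr K)^2 - Tr K^2 + (3)R; should vanish when the enveloping 4-geometry is flat."""
    return extrinsic_correction(rho1, rho2, rho3) + intrinsic_scalar_curvature(
        rho1, rho2, rho3
    )


print(four_curvature_measure(1.5, 2.0, 7.0))
```

For any choice of nonzero radii the printed value vanishes to rounding error, which is the sense in which the bilinear expression in $K$ exactly compensates $^{(3)}R$.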


Subscript on Relation of Hamilton—Jacobi Method to 
Conventional Analytic Solutions of Field Equations 


To look toward the Hamilton-Jacobi equation as an illuminating way to 
found general relativity is not to look away from the field equations as a way 
to solve problems in general relativity. The Hamilton-Jacobi equation is 
divorced from the equations of motion no more in geometrodynamics than in 
particle mechanics. Even in the most elementary problem the connection 
between the two methods is inescapable. Yes, one can solve Lagrange’s
equation

$$m\frac{d^2x}{dt^2} + \frac{\partial V}{\partial x} = 0$$

by direct numerical integration. Yes, one can translate the problem into
Hamilton-Jacobi formalism, write down the equation

$$-\frac{\partial S}{\partial t} = \frac{1}{2m}\left(\frac{\partial S}{\partial x}\right)^2 + V$$

and integrate it by the most elementary numerical methods (replacement of
$(x, t)$ continuum by lattice space, $t = m\tau$, $x = n\delta$, with $m$ and $n$ integers;
replacement of partial differential equation by difference equation). However,
the most penetrating method to integrate the partial differential equation
has long been known to go straight back to the Lagrange equation itself for its
start. Not everywhere does one at first evaluate $S(x, t)$, but only along a
classical world line or history $H$,

$$x = x_H(t), \qquad \dot{x} = \frac{dx_H(t)}{dt}$$

thus

$$s(t) = S(x_H(t), t) = \int_{t_0}^{t} \left[\tfrac{1}{2} m \dot{x}_H^2 - V(x_H(t), t)\right] dt$$

or, in a more general problem, with Lagrange function $L$,

$$s(t) = S(x_H(t), t) = \int_{t_0}^{t} L(x_H, \dot{x}_H, t)\, dt$$

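From a knowledge of $S$ along the world line one goes to a knowledge of $S$ in a
narrow band on either side of the world line by using the relations

$$\left(\begin{array}{c}\text{RATE OF CHANGE} \\ \text{OF ACTION } S \\ \text{WITH POSITION } x\end{array}\right) = (\text{MOMENTUM}) = \frac{\partial L}{\partial \dot{x}}\ (\text{GENERAL}) = m\dot{x}_H(t)\ (\text{HERE})$$

and

$$-\left(\begin{array}{c}\text{RATE OF CHANGE OF} \\ \text{ACTION } S \text{ WITH} \\ \text{TIME } t\end{array}\right) = (\text{ENERGY}) = \dot{x}\frac{\partial L}{\partial \dot{x}} - L\ (\text{GENERAL}) = \tfrac{1}{2} m \dot{x}_H^2 + V(x_H(t), t)\ (\text{HERE})$$

Thus at the point

$$x = x_H(t^*) + \delta x, \qquad t = t^* + \delta t$$

a little way off the world line, the Hamilton-Jacobi function (“classical
phase”; $\hbar$ times phase of quantum-mechanical wave function) is

$$S(x, t) = s(t^*) + \left(\frac{\partial S}{\partial x}\right)\delta x + \left(\frac{\partial S}{\partial t}\right)\delta t$$
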
An identical procedure gives $S_{\text{new}}(x, t)$, a new solution with
infinitesimally different initial conditions, throughout a band extending on either
side of an infinitesimally different world line, $S_{\text{new}}(x, t)$. The “condition of
constructive interference” between the new “wave” and the old “wave,”

$$S(x, t) = S_{\text{new}}(x, t)$$

gives back at once the original solution of the equation of motion, and all the
history $H$ that goes with that solution. So too in geometrodynamics!
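
The two relations between the action and the momentum and energy can be checked on an elementary example. The sketch below is an illustration under stated assumptions, not the procedure of the text: it uses a harmonic oscillator, $V = \tfrac{1}{2} m \omega^2 x^2$, evaluates the action by integrating the Lagrange function along the classical world line from a fixed starting point to a variable end point $(x, t)$, and compares finite-difference derivatives of that action with the momentum and energy on the world line. The parameter values and all function names are hypothetical.

```python
import math

M = 1.0       # mass (illustrative value)
OMEGA = 1.0   # angular frequency (illustrative value)
X0 = 0.3      # fixed starting position at time t = 0 (illustrative value)


def classical_path(x_end, t_end):
    """Classical oscillator world line from (X0, 0) to (x_end, t_end)."""
    b = (x_end - X0 * math.cos(OMEGA * t_end)) / math.sin(OMEGA * t_end)

    def position(t):
        return X0 * math.cos(OMEGA * t) + b * math.sin(OMEGA * t)

    def velocity(t):
        return OMEGA * (-X0 * math.sin(OMEGA * t) + b * math.cos(OMEGA * t))

    return position, velocity


def action(x_end, t_end, steps=2000):
    """Integrate L = kinetic - potential along the world line (Simpson's rule)."""
    position, velocity = classical_path(x_end, t_end)

    def lagrangian(t):
        return 0.5 * M * velocity(t) ** 2 - 0.5 * M * OMEGA**2 * position(t) ** 2

    h = t_end / steps  # steps must be even for Simpson's rule
    total = lagrangian(0.0) + lagrangian(t_end)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * lagrangian(k * h)
    return total * h / 3.0


def hamilton_jacobi_check(x, t, eps=1e-5):
    """Return (dS/dx, momentum, -dS/dt, energy) at the end point (x, t)."""
    ds_dx = (action(x + eps, t) - action(x - eps, t)) / (2 * eps)
    ds_dt = (action(x, t + eps) - action(x, t - eps)) / (2 * eps)
    _, velocity = classical_path(x, t)
    momentum = M * velocity(t)
    energy = 0.5 * M * velocity(t) ** 2 + 0.5 * M * OMEGA**2 * x**2
    return ds_dx, momentum, -ds_dt, energy


print(hamilton_jacobi_check(0.8, 1.0))
```

The first pair of returned numbers agree with each other, as do the second pair, to the accuracy of the finite differences. The end time must avoid multiples of $\pi/\omega$, where the family of world lines through the starting point refocuses and the construction breaks down.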

One can use a known solution of the Einstein field equations to find 
the Hamilton-Jacobi function S(°’Y) throughout a certain narrow swathe 
through superspace. For this purpose one will find it easiest to start with one 
of the well-known analytic solutions of Einstein’s field equations. Let the 
4-geometry be written in the form 


ds* = gy, dx* dx? 


where the metric coefficients $g_{\alpha\beta}$ are certain known functions of the four
coordinates $x^\alpha$. The 4-geometry expressed in this way summarizes the
dynamical history H of the 3-geometry of space. It also defines a submanifold
(call it H) of the superspace $\mathcal{S}$. How now to determine the values
taken on by the Hamilton-Jacobi function throughout the submanifold?
First define the equivalent of the starting time $t_0$ in the one-particle problem.
This equivalent is not a single number $t_0$, but in general a different number
$t_0(x, y, z)$ for each point in 3-space; or better stated, it is a spacelike initial
value hypersurface $\sigma_0$ slicing through the given ${}^{(4)}\mathcal{G}$. Next, give the equivalent
of the point $x_H(t_0)$, $t_0$ along the classical history. It is the “momentary
state of the geometry of space” on the hypersurface $\sigma_0$; that is, it is the
3-geometry ${}^{(3)}\mathcal{G}_0$ defined by the metric


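$$ds^2(\sigma_0) = \left[ g_{00} \left(\frac{\partial x^0}{\partial x^m}\right) \left(\frac{\partial x^0}{\partial x^n}\right) + 2 g_{0m} \left(\frac{\partial x^0}{\partial x^n}\right) + g_{mn} \right] dx^m\, dx^n$$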


Similarly with the equivalent of the running time $t$ and the state $x_H(t)$ of the
particle at this time. Thus, give a spacelike hypersurface $\sigma$ by giving
$t = t(x, y, z)$; and calculate the corresponding metric $ds^2(\sigma)$, which defines a
3-geometry ${}^{(3)}\mathcal{G}$ “on the classical history H.” The value of the classical
phase function $S$ on this classical history is given by a fourfold integral. This
integral is extended over the region of spacetime bounded by the two hypersurfaces.
It has the form

$$S_H(\sigma) = \int_{\sigma_0}^{\sigma} \mathcal{L}(x^\alpha)\, d^4x$$


Here the Lagrange density $\mathcal{L}$ is given, for example, by Arnowitt, Deser, and
Misner ([2]; see also GMD). Consider now a “point” ${}^{(3)}\mathcal{G}$ in superspace a
little ways removed from some “point” ${}^{(3)}\mathcal{G}^*$ that lies on the classical
history H. What is the value of $S$ for this new 3-geometry? The difference
between the two 3-geometries is expressed most conveniently for the present
expository purpose in terms of the difference of the metric coefficients at corresponding
points:

$$\delta g_{mn}(x, y, z) = g_{mn}(x, y, z;\ {}^{(3)}\mathcal{G}) - g_{mn}(x, y, z;\ {}^{(3)}\mathcal{G}^*)$$




This difference in 3-geometries has to be multiplied by the value of the conjugate
geometrodynamical momentum $\pi^{mn}(x, y, z)$ at the point ${}^{(3)}\mathcal{G}^*$ on the
classical history H (details of definition and calculation of $\pi^{mn}$ given, for
example, in GMD) and integrated to give the change in the Hamilton-Jacobi
function. Thus, throughout a thin swathe in superspace one can
write

$$S({}^{(3)}\mathcal{G}) = S_H(\sigma^*) + \int \pi^{mn}\, \delta g_{mn}\, d^3x$$


PROBLEM 2. STRUCTURE OF SUPERSPACE 


Spacetime is the arena of particle dynamics. Superspace is the arena of
geometrodynamics. How different our knowledge of these two arenas! In 
particle dynamics the structure of spacetime is taken to be given as from on 
high: an everywhere flat ideal Minkowski-Lorentz manifold. In geometro- 
dynamics the structure of superspace is to be considered as defined entirely 
internally; that is to say, by the very form of the “Einstein-Schroedinger
equation” itself. We write this equation symbolically as

V2 
- ag il 
To ask about the structure of this equation is therefore nothing more or less 
than to ask, what is the structure of superspace? 

It is conceivable that a given geometry may be more appropriately iden- 
tified with a whole class of points in superspace than with a single point, when 
that 3-geometry has less than maximal symmetry. In this way, Professor 
Stephen Smale kindly points out, one may keep from introducing “conical
singularities” in superspace. To avoid such singularities would seem to be
essential if superspace is to rise from a mere topological manifold to the status 
of a differentiable manifold. 

Quantum theory once known, classical theory follows easily and 
uniquely, by way of the correspondence principle, 


$$\psi \sim \left(\text{slowly varying amplitude factor}\right) \exp\left(\frac{iS}{\hbar}\right)$$


However, classical theory alone being known, it is ordinarily difficult or 
impossible uniquely to determine the form of the quantum wave equation [47]. 
No system better illustrates this point than a particle moving in a prescribed 
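external electromagnetic field. To know the Hamilton-Jacobi equation alone

$$\eta^{\alpha\beta} \left(\frac{\partial S}{\partial x^\alpha} + \frac{e}{c} A_\alpha\right) \left(\frac{\partial S}{\partial x^\beta} + \frac{e}{c} A_\beta\right) + m^2 c^2 = 0$$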




is to have no way to decide between the Schroedinger—Klein-Gordon equa- 
tion, the Dirac equation, or one or another wave equation of higher spin: 
they all reduce to the same Hamilton-Jacobi equation in the appropriate semi- 
classical limit. Similarly in geometrodynamics. From the well-defined
“Einstein-Hamilton-Jacobi” equation (12, 13)


VS \? 
(554) i 


no one knows a satisfying way to go to a unique “Einstein-Schroedinger”
equation [48].

In the example of the particle most of the ambiguity about the form of 
the wave equation is resolved as soon as observation or other evidence 
reveals the spin [49]—or more directly—how many possible orientations there 
are for any spin degree of freedom over and above the three degrees of freedom 
that tell location in space. In other words, in addition to the Hamilton-Jacobi
equation, one must know the “structure of configuration space” in
order to end up with a well-defined wave equation. Hence our question
here: What is the structure of superspace?

Unravel the structure of superspace? Hardly all in one jump, and 
hardly today! Instead a step by step penetration of the issues is more to be 
anticipated, if the history of electromagnetism or atomic structure or other 
branches of physics is any guide. At least three levels of analysis are to be 
perceived. First, one already knows the structure relevant to classical 
geometrodynamics: (a) Superspace has Hausdorff topology (Stern [4]).
(b) Superspace is endowed with an indefinite metric, defined by the expression [50]

$$\frac{1}{2 g^{1/2}} \left( g_{ik}\, g_{jl} + g_{il}\, g_{jk} - g_{ij}\, g_{kl} \right)$$


that occurs in the Hamilton-Jacobi equation (Eq. (12)). Second, one searches 
for those features in the structure of superspace that come into play when 
space changes its topology. Third, one hopes in the longer term to uncover
those deeper properties of superspace that give ordinary space its dimen- 
sionality, its metric structure, and its ability to propagate electromagnetic 
fields and neutrinos with the speed of light. It is appropriate to discuss the 
structure of superspace a little further at all three of these levels. 


Level 1. Classical Geometrodynamics; Topology Does Not 
Change 


In classical geometrodynamics space does not change its topology. 
Space may be “preparing” to change its topology, but it cannot actually
make the change within the context of classical theory [51]. It can at most
signal its “intention” to change topology by developing somewhere a curvature
that increases without limit (“gravitational collapse” [18]). To go
further with the analysis of the collapse phenomenon and treat changes in 
topology forces one to go outside the framework of classical theory. So 
long as one stays inside classical theory, he must restrict his attention in one
dynamical problem to one topology. Which topologies are acceptable? 

Of all topologies none is more familiar than that associated in Einstein’s
theory with the geometry in and around a dilute center of attraction such as
the sun. Curved at small distances, the geometry becomes asymptotically
flat at large distances. The topology, as distinguished from the geometry, is
Euclidean: $E_3$, or $R \times R \times R$. In a broader context, however, Einstein
thought of the space around one star as part of the space engulfing all stars
[52], and of this total space as closed and endowed with the topology of the
3-sphere, $S_3$. No one has stated more strongly than he the arguments for
considering space to be closed [53]. 

One acquires a new reason to consider space to be closed when he considers
how hard it is to define “an open and asymptotically flat space” in the
context of quantum geometrodynamics: Those ${}^{(3)}\mathcal{G}$’s that occur with overwhelming
probability are everywhere endowed with all kinds of ripples and
other geometrical structures at the scale of the Planck length. There is no 
direction that one can take and there is no distance that one can travel that 
will erase this structure. Under these circumstances it is difficult to attribute 
any well-defined meaning whatsoever to the term “asymptotically flat.” On
the other hand, one knows more than one example of a space that is open and 
not asymptotically flat, and that becomes wilder and wilder at great distances 
[54]. Not having any means to distinguish one open space from another as 
“good,” one would seem justified at this stage in the development of the
subject to exclude from attention all open spaces. This approach is the more
attractive in that it appears possible [55] to distinguish between one type of
closed 3-geometry and another by straightforward classificatory integers
similar to the Betti numbers. In contrast, the concept of “asymptotically
flat” (even in contexts where it is relevant) is much more complex to formulate
[56].

Out of a manifold with the topology $E_3$ it is possible to cut out a block
with the shape of a cube and, by identifying opposite faces, obtain the topology
of the 3-torus, $S_1 \times S_1 \times S_1$.
On a manifold with this topology the initial value problem of classical geometrodynamics
presents difficulties in certain cases, according to unpublished
considerations of Brill and Avez. It is conceivable that these difficulties
indicate that the topology $S_1 \times S_1 \times S_1$ is not acceptable. In that event one
might almost say that this topology has inherited a “defective gene” from its
parent topology $E_3$.

Another “black gene” that one can perhaps reasonably exclude is nonorientability.
We assume in effect that “transport” of a right-handed glove
shall never bring it back to its starting point left handed. 




Compatible with these very tentative principles for selecting “acceptable
topologies” are the 3-sphere ($S_3$); the 3-sphere with addition of one
handle or “wormhole” ($S_2 \times S_1 = W_1$); and the 3-sphere with $n$ wormholes
($W_n$). There may or may not be further acceptable topologies not included in
this list [55].


Fixed Topology Excludes GMD Account of Particles and Fields 


Taking the most familiar of these “acceptable topologies,” $S_3$, as case
example, one can discuss and treat quantitatively at the classical level a rich 
variety of physical processes, including gravitational radiation [38]; gravita- 
tional geons [20]; the planetary motion, collisions, and breakup of geons [20]; 
the Taub model of an expanding and recontracting universe [57]; and more 
complex model universes [38]. Even so, one is limited in the physics that he 
can include in any of these models. One has no place for particles. Nor can
one treat classically the final stages of gravitational collapse. These limitations
seem unrelated to each other. In classical theory they are unrelated. In that 
view, the geometry of space remains forever tied to a unique topology. Not 
so in the purely quantum-geometrodynamical model of physics. There (1) a 
particle is pictured in terms of space resonating from one topology to another 
[58]; and there (2) the final stages of gravitational collapse are viewed as a 
coupling of macroscopic motion and microscopic topology (“waterfall and
foam”). In excluding from consideration all changes in topology, classical
theory, on these views, also excludes any account of the rationale of particles. 
Particles have to be introduced as foreign and physical entities and space has 
to be viewed as arena, not as structural material. In this limited conceptual 
framework the particles and other fields are counted as having degrees of 
freedom over and above those of the geometry. The electromagnetic field, in 
particular, is viewed as an entity additional to geometry. 

Attempts have been made to regard electromagnetism as an aspect of 
geometry. This enterprise limits itself to that classical framework of ideas 
where one looks apart from submicroscopic fluctuations in the geometry and 
the topology of space. One attempt has had some minor success. The 
second-order equations of Maxwell and the second-order equations of Einstein 
have been combined into one set of equations of the fourth order that make no 
reference to any but geometric magnitudes [59]. The Maxwell field is recognized
by the “footprints” it leaves on the geometry of space. In a certain
sense those “footprints” are the electromagnetic field; hence the name,
“already unified theory” of gravitation and electromagnetism. Faithfully
though this theory reproduces in other respects the dynamic content of the 
equations of Maxwell and Einstein, it turns out not to be adapted to treat the 
initial value problem. Geometrical measurements alone on an initial spacelike
hypersurface do not always suffice completely to determine the future time
evolution of the geometry [60]. In other words, one cannot uphold in
“already unified field theory” a strict division of dynamics into “equations of
motion” and “initial value data for these equations.” Yet on that division
one had learned to insist, not least by reason of hard-won lessons from the 
Hamilton-Jacobi theory and from quantum theory. Therefore the electro- 
magnetic field, like a particle, can hardly be treated as anything but a foreign 
or “physical” entity immersed in space so long as one looks away from the
microscopic geometry of space. 


Formalism of Field When Treated as “Foreign and Physical”


In this nongeometric or “arena” approach to physics one considers the
degrees of freedom of the “foreign and physical” entity as additional to those
of the geometry. In the most elementary example, the case where only one
such entity is contemplated, the pure source-free electromagnetic field, one
writes the Hamilton-Jacobi function or the Schroedinger probability amplitude,
as the case may be, in the form

$$S = S(g_{ik}(x, y, z),\ A_n(x, y, z))$$

or

$$\psi = \psi(g_{ik},\ A_n)$$


Similarly when more fields are involved. Even at this level of analysis the 
geometrical approach has a contribution to make. The Hamilton-Jacobi 
function, ostensibly dependent upon the individual components of the metric 
and of the electromagnetic vector potential, actually has to be understood to 
depend only on the 3-geometry ${}^{(3)}\mathcal{G}$ and on the 2-form $B = dA$. Consequently
it cannot change $S$ to “push the rubber sheet on which the coordinates
are painted” over space in such a way that the point $P$, which formerly had
the coordinates $x^i$, now acquires the coordinates $x^i - \xi^i$. Similarly the point
$P + dP$, which formerly had the coordinates $x^i + dx^i$, now acquires the coordinates

$$x^i + dx^i - \xi^i - \left(\frac{\partial \xi^i}{\partial x^j}\right) dx^j$$

The separation between the two points of course remains unchanged by this
alteration in coordinates:

$$ds^2 = g_{ij}(x^k)\, dx^i\, dx^j = g'_{ij}(x^k - \xi^k)\, (dx^i - \xi^i{}_{,m}\, dx^m)(dx^j - \xi^j{}_{,n}\, dx^n)$$


To terms of the first order in the displacement $\xi^k$ the alteration in the metric
coefficients is

$$\delta g_{ij} = g'_{ij} - g_{ij} = \left(\frac{\partial g_{ij}}{\partial x^k}\right) \xi^k + g_{ik}\, \xi^k{}_{,j} + g_{kj}\, \xi^k{}_{,i} = \xi_{i|j} + \xi_{j|i}$$

where the subscript $|j$ denotes covariant differentiation, in the space of the
3-geometry, with respect to the $j$th coordinate. In a similar way one finds the
alteration in the electromagnetic vector potential,

$$\delta A_i = A'_i - A_i = A_{i,k}\, \xi^k + A_k\, \xi^k{}_{,i}.$$

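
The first-order alteration of the metric coefficients under a coordinate shift, $\delta g_{ij} = g_{ij,k}\, \xi^k + g_{ik}\, \xi^k{}_{,j} + g_{kj}\, \xi^k{}_{,i}$, lends itself to a direct numerical check. The sketch below is illustrative only and rests on assumptions not made in the text: a flat 2-geometry in polar coordinates stands in for the 3-geometry, derivatives are taken by central differences, and all function names are hypothetical.

```python
def metric_polar(x):
    """Flat 2-geometry in polar coordinates (r, phi): ds^2 = dr^2 + r^2 dphi^2."""
    r, _ = x
    return [[1.0, 0.0], [0.0, r * r]]


def d_metric(g, x, k, h=1e-5):
    """Central-difference estimate of g_ij,k at the point x."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    gp, gm = g(xp), g(xm)
    n = len(x)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(n)] for i in range(n)]


def d_vector(xi, x, j, h=1e-5):
    """Central-difference estimate of xi^k_,j at the point x (list indexed by k)."""
    xp, xm = list(x), list(x)
    xp[j] += h
    xm[j] -= h
    vp, vm = xi(xp), xi(xm)
    return [(vp[k] - vm[k]) / (2 * h) for k in range(len(x))]


def delta_g(g, xi, x):
    """delta g_ij = g_ij,k xi^k + g_ik xi^k_,j + g_kj xi^k_,i at the point x."""
    n = len(x)
    gx, v = g(x), xi(x)
    dg = [d_metric(g, x, k) for k in range(n)]    # dg[k][i][j] = g_ij,k
    dv = [d_vector(xi, x, j) for j in range(n)]   # dv[j][k] = xi^k_,j
    return [[sum(dg[k][i][j] * v[k] + gx[i][k] * dv[j][k] + gx[k][j] * dv[i][k]
                 for k in range(n))
             for j in range(n)] for i in range(n)]


point = (2.0, 0.3)
eps = 1e-3
rotation = delta_g(metric_polar, lambda x: [0.0, eps], point)   # shift of phi
radial = delta_g(metric_polar, lambda x: [eps, 0.0], point)     # shift of r
print(rotation, radial)
```

A rigid shift of the angle leaves every metric coefficient unaltered, while a radial shift alters only $g_{\phi\phi}$, by $2 r \varepsilon$. In neither case has the flat geometry itself changed; only the coordinates painted on it have moved.
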
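These changes in the metric coefficients, taking place throughout all space,
ostensibly produce a change in the Hamilton-Jacobi function determined by
the functional derivatives of $S$; thus,

$$\delta S = \int \left[ \left(\frac{\delta S}{\delta g_{ij}}\right) \delta g_{ij} + \left(\frac{\delta S}{\delta A_i}\right) \delta A_i \right] d^3x$$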


However, the mere shift in coordinates cannot produce any real physical 
change. Consequently the quantity $\delta S$ must vanish, independent of the
choice of the coordinate displacements $\xi^i$. The consequences of this coordinate
invariance are easily traced out. The functional derivatives that come
into consideration have well-defined physical interpretations. Thus 


$$\pi^{ij} = \frac{\delta S}{\delta g_{ij}}$$

is the geometrodynamical momentum [2, 3] conjugate to the geometrodynamical
field coordinate $g_{ij}$. It is closely connected with the “extrinsic
curvature” $K^{ij}$ of the 3-geometry with respect to the yet-to-be constructed
4-manifold. The other derivative,

$$\mathcal{E}^i = \frac{\delta S}{\delta A_i},$$

the electrodynamic momentum conjugate to the vector potential $A_i$, is
nothing other than the electric field. This quantity satisfies the divergence
condition [61]

$$\mathcal{E}^i{}_{,i} = 0$$
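The condition that $S$ be invariant with respect to coordinate changes takes
the form

$$0 = \delta S = \int \left[ \pi^{ij} (\xi_{i|j} + \xi_{j|i}) + \mathcal{E}^i (A_{i,k}\, \xi^k + A_k\, \xi^k{}_{,i}) \right] d^3x$$

Integrating by parts, readjusting the positions of indices as appropriate, and
making use of the divergence relation, one finds

$$0 = \int \left[ -2 \pi_i{}^{j}{}_{|j} + \mathcal{E}^j (A_{j,i} - A_{i,j}) \right] \xi^i\, d^3x$$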




This must vanish for arbitrary choices of the field of coordinate displacements
$\xi^i$. Consequently the quantity in square brackets must vanish. One finds in
this way those three of Einstein’s field equations that link the curvature of
space with the Poynting density of flow of electromagnetic field energy [62]:

$$2 \pi_i{}^{j}{}_{|j} = \mathcal{E}^j (A_{j,i} - A_{i,j})$$


How remarkable that one gets so much (including the Poynting vector
itself) from the elementary condition that $S$ should depend, not upon the
components $g_{ij}$ and $A_i$ individually, but only upon the coordinate-independent
geometrical quantities that are “dressed up” in these components [63]!

Express electrodynamics in geometric language, yes. See the foot- 
prints of the electromagnetic field on the geometry of space, yes. Conceive of 
classical geometrodynamical electric charge as lines of force trapped in the 
topology of space, yes. But explain geometrically the “necessity” for electromagnetism
and other “physical” entities, no. Not within the context of
classical geometrodynamics and its never-changing topology. 


Level 2. Space Resonating between 3-Geometries of Varied 
Topology 


A new world opens out for analysis in quantum geometrodynamics. 
The central new concept is space resonating between one foamlike structure
and another. For this multiple connectedness of space at submicroscopic distances
no single feature of nature speaks more powerfully than electric charge.
Yet at least as impressive as charge is the prevalence of spin ½ throughout the
world of elementary particle physics. “It is impossible to accept any description
of elementary particles that does not have a place for spin ½.” This
quotation from the book Geometrodynamics [64] of 1962 goes on to say,
“What then has any purely geometrical description to offer in explanation of
spin ½ in general? More particularly and more importantly, what possible
place is there in quantum geometrodynamics for the neutrino, the only
entity of half integral spin which is a pure field in its own right, in the sense
that it has zero rest mass and moves with the speed of light? No clear or
satisfactory answer is known to this question today. Unless and until an
answer is forthcoming, pure quantum geometrodynamics must be judged
deficient as a basis for elementary particle physics.” Happily the concept of
spin manifold [65] has subsequently come to light, not least through the work 
of John Milnor. This concept suggests a new and interesting interpretation 
of a spinor field within the context of the resonating microtopology of quantum 
geometrodynamics, as the nonclassical two-valuedness that attaches to the 
probability amplitude for otherwise identical 3-geometries endowed with alterna- 
tive “spin structures.” 




The Orientation Entanglement Relation or “Version”


Spin and topology: what is the connection? Take a cube (Fig. 7). To 
its upper northeast corner attach one end of a long elastic string. Run the 
other end to the upper northeast corner of the room and fasten it there. 
With seven other elastic strings attach the other corners of the cube to the 
corresponding corners of the room. Now select any axis running through the 
center of the cube and rotate the figure about that axis through 360°. The 
cube resumes its original configuration. Not so the strings. They are in a
tangle. Moreover, rejecting cuts, one has no way to untangle the strings.
Consequently it needs more than orientation to tell the relation between the
cube and its surroundings. The necessary bookkeeping is provided by a
spinor,

$$s = \begin{pmatrix} u \\ v \end{pmatrix}$$

FIGURE 7. The elastic strings attached to the central cube keep account of its
orientation entanglement relation with its surroundings.


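Under a rotation through the angle $\theta$ about an axis making angles $\alpha$, $\beta$, $\gamma$ with
the $x$, $y$, and $z$ axes this spinor is transformed to the new spinor $s' = qs$. Here
the quaternion or “versor” $q$ has a value indicated in various ways in various
well-known systems of nomenclature:

$$
\begin{aligned}
q &= \cos\tfrac{1}{2}\theta + \sin\tfrac{1}{2}\theta\, (i \cos\alpha + j \cos\beta + k \cos\gamma)
  \qquad (\text{with } ij = -ji = k, \text{ etc.}) \\
  &= \cos\tfrac{1}{2}\theta - i \sin\tfrac{1}{2}\theta\, (\sigma_x \cos\alpha + \sigma_y \cos\beta + \sigma_z \cos\gamma)
  \qquad (\text{with } i = (-1)^{1/2} \text{ and } \sigma_x \sigma_y = -\sigma_y \sigma_x = i \sigma_z, \text{ etc.}) \\
  &= \begin{pmatrix}
       \cos\tfrac{1}{2}\theta + i \sin\tfrac{1}{2}\theta \cos\gamma & \sin\tfrac{1}{2}\theta\, (\cos\beta - i \cos\alpha) \\
       \sin\tfrac{1}{2}\theta\, (-\cos\beta - i \cos\alpha) & \cos\tfrac{1}{2}\theta - i \sin\tfrac{1}{2}\theta \cos\gamma
     \end{pmatrix}
\end{aligned}
$$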


The important point is the change of sign under a 360° rotation:
$q(360^\circ)\, s = -s$. The 360° rotation alters what one may most appropriately
call the “orientation entanglement relation” between the cube and its surroundings
(or, more briefly, the “version” of the cube). The spinor keeps
account of this orientation entanglement relation. Two successive rotations 
by 360° restore the cube to its original orientation entanglement relation with 
its surroundings. The strings at first appear to be tangled up with twice the 
twist they had before. Nevertheless, they can now be untangled completely, 
as one confirms by direct trial or by elementary reasoning [66, 67]. 
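
The change of sign under one full turn, and its restoration under two, can be confirmed by direct arithmetic on the 2 × 2 form of the versor. The lines below are a minimal sketch only: the matrix entries follow the last form of $q$ quoted above, while the function names, the particular axis, and the starting spinor are assumptions chosen for illustration.

```python
import math


def versor(theta, axis):
    """2 x 2 matrix q for a rotation by theta about the unit axis
    (cos alpha, cos beta, cos gamma)."""
    ca, cb, cg = axis
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return (
        (complex(c, s * cg), s * complex(cb, -ca)),
        (s * complex(-cb, -ca), complex(c, -s * cg)),
    )


def apply(q, spinor):
    """Matrix product q s for a two-component spinor s."""
    (a, b), (c, d) = q
    u, v = spinor
    return (a * u + b * v, c * u + d * v)


axis = (1 / 3, 2 / 3, 2 / 3)      # direction cosines; their squares sum to 1
s0 = (0j, 1 + 0j)                 # assumed starting spinor

once = apply(versor(2 * math.pi, axis), s0)    # one rotation by 360 degrees
twice = apply(versor(4 * math.pi, axis), s0)   # two successive rotations

print(once)    # numerically close to (0, -1): the spinor has changed sign
print(twice)   # numerically close to (0, +1): the original spinor is restored
```

The result does not depend on the axis chosen: at $\theta = 360°$ the factor $\sin\tfrac{1}{2}\theta$ vanishes and $\cos\tfrac{1}{2}\theta = -1$, whatever the direction cosines.
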
Arbitrarily pick out one way of placing the cube, call it the “standard
orientation entanglement relation” between the cube and its surroundings,
or the standard “version” of the cube, and associate with it the spinor
$\begin{pmatrix} 0 \\ 1 \end{pmatrix}$. Proceed similarly at other points of space, taking care only that any
changes in the standard version from point to point shall take place
smoothly.

A special situation develops when an orientable 3-geometry [68] is 
endowed with a handle or wormhole. The cube can be transported in 
imagination from A to B “through the surrounding nearby space” or
“through the wormhole.” With one cube possessing a certain pattern of
colored faces follow one route and with another identically colored cube 
follow the other route. It makes physical sense to ask if the two cubes have 
at B identical orientation entanglement relations to their surroundings [67]. 
If they do not, alter the definition of the standard orientation within the worm- 
hole by a rotation. Let this rotation increase continuously from zero at the 
mouth of the handle near A to 360° at the mouth near B. At each point out- 
side the wormhole a triad of axes a, b, c defines the direction of the axes of 
the cube when the cube is located at that point in its standard orientation. 
These triads are not affected by the alterations made inside the wormhole. 
They vary in direction as smoothly as ever from point to point. At the two 
mouths of the wormhole they join on as smoothly to the new field of triads 
inside as they did to the old field of triads. Now, however, one has at last
achieved an everywhere continuous field of standard orientation entanglement
relations. The associated spinor field is likewise continuous, having
everywhere the standard value $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$, whereas before it underwent somewhere a
discontinuous change from this value to its negative. One can restate in
mathematical language what has been accomplished. First, one has laid 
down a “spin structure” upon the manifold. This structure is defined by
the field of triads, or by “the class of such fields that are equivalent under
homotopy.” Second, one has laid down a “spinor field” upon the manifold.
This spinor field is defined with respect to the given “spin structure.” Other
spinor fields can of course be laid down, with components also varying con- 
tinuously from place to place. 


Alternative Spin Structures in a Multiply Connected Space 


Start again with another closed orientable 3-manifold endowed with the 
original topology and metric. One might hope to obtain an acceptable spin 
structure on this new manifold by taking over to it the identical field of triads 
that serves for the original manifold. It may be that this supposition is 
correct about the particular new manifold that one happens to have picked 
out. In this event identical colored cubes “taken from A to B by inequivalent
routes” will preserve their orientation entanglement relation, one to the
other. However, the supposition can equally well be incorrect. If so, the 
consequences are direct. The two cubes, carried by different routes from A 
to B, always in alignment with the canonical field of triads, will end up at B 
inequivalent to the extent of a 360° rotation. Moreover, this inequivalence 
in the orientation entanglement relation with the surroundings can be detected 
in principle by direct physical measurement [67]. In other words, the spin 
structure of the original manifold does not apply to the new manifold. To 
serve in the new manifold, the field of triads has to be modified to the extent of 
a 360° rotation within the wormhole or by an equivalent change. The dif- 
ference between the new manifold and the original manifold expresses itself in 
these terms: the two manifolds have the same topology and metric, but they 
have inequivalent spin structures. The difference between the two manifolds is 
not only mathematical. It is physical. 


The Multisheeted Character of Superspace 


One does not classify the closed orientable 3-manifold of physics com- 
pletely when he gives its topology, its differential structure, and its metric. 
He must tell in addition which spin structure it has. The spin structure, like 
the metric, lends itself in principle to observation. Consequently it is not 
enough for the purposes of quantum geometrodynamics to give the probability
amplitude for a “3-geometry” as the term 3-geometry was previously understood.
One must introduce a new two-valued descriptor, $w_k$, for each ($k = 1,
2, \ldots, n$) of the $n$ wormholes of the manifold, to distinguish the two inequivalent
ways to lay down a spin structure in the interior of that wormhole. The
new, enlarged concept of a 3-geometry ${}^{(3)}\mathcal{G}$ adjoins these descriptors to the
continuous infinity of parameters which alone served previously to distinguish
one 3-geometry, ${}^{(3)}\mathcal{G}^{\text{old}}$, from another; thus,

$${}^{(3)}\mathcal{G} = ({}^{(3)}\mathcal{G}^{\text{old}};\ w_1, w_2, \ldots, w_n)$$

and

$$\psi({}^{(3)}\mathcal{G}) = \psi({}^{(3)}\mathcal{G}^{\text{old}};\ w_1, w_2, \ldots, w_n)$$


In other words, superspace acquires a multisheeted character (Fig. 8) with $2^n$
distinct sheets in that region of superspace where the 3-geometry is endowed
with $n$ wormholes.

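
The counting of sheets is elementary bookkeeping, and may be made concrete as follows. This is only an illustrative sketch; the function name is hypothetical, and the labels +1 and -1 are used for the two values of each descriptor purely as a convenient choice.

```python
from itertools import product


def spin_structure_sheets(n_wormholes):
    """All assignments of a two-valued descriptor (+1 or -1) to each wormhole."""
    return list(product((+1, -1), repeat=n_wormholes))


sheets = spin_structure_sheets(3)
print(len(sheets))    # 8, that is, 2**3 sheets for three wormholes
print(sheets[0])      # (1, 1, 1), one particular choice of descriptors
```

Each tuple labels one sheet; the probability amplitude is then a function of the old 3-geometry together with one such tuple.
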

For the two values to be assigned to the descriptor $w_k$ it is natural to pick
+1 and —1. However, there is nothing anomalous about the one spin 
structure as compared to the other. No canonical way has ever been pro- 
posed to give preference to one as compared to the other. Therefore it is a 
matter of arbitrary choice to which spin structure to assign the descriptor +1, 
and to which, —1. 

The words “spin structure” can mislead. They suggest that there is
something special about “laying down a spinor field” upon the manifold.
They conjure up visions of laying down upon the 3-geometry other kinds of
fields that transform according to groups other than the spinor group SU(2);
for example, SU(n) or SL(n). However, the relevant point in the whole
analysis is not the field of spinors, but the field of triads and their orientation
entanglement relations. There is nothing in the concept of spin structure that
one could not have conveyed, with less chance of being misunderstood, by
using the phrase “triad structure.” Moreover, there is not the slightest
indication that there is any other structure of a closed orientable 3-manifold
that remains to be brought to light. Consequently we take ${}^{(3)}\mathcal{G}$ in its new and
enlarged sense to be the full indicator of the configuration of space, and as
containing the full set of variables upon which $\psi$ depends. In other words,
we take the new ${}^{(3)}\mathcal{G}$ to comprise a set of commuting observables, complete in
the sense of quantum mechanics, and therefore suitable for analysis of the
probability amplitude $\psi$.

We do not add a spinor field to geometry. Quite the contrary. We 
take a spinor field away from geometry. A 3-geometry, augmented by a
spin structure (as might for example be indicated by the descriptor
$(w_1, w_2, \ldots, w_5) = (+1, +1, +1, -1, +1)$) is a possible habitation for a
spinor field, but we have thrown out the inhabitant [69].

Spin ½, if it occurs naturally in the context of quantum geometrodynamics,
can hardly occur in any other sense than that in which Pauli spoke of
spin from the very beginning, as a “nonclassical two-valuedness.” Moreover,
a nonclassical two-valuedness is already inescapable in the formalism. There
are separate probability amplitudes for a 3-geometry with descriptor $w_k = +1$
and for an otherwise identical 3-geometry with descriptor $w_k = -1$. Does
this circumstance imply that quantum geometrodynamics supplies all the
machinery one needs to describe fields of spin ½ in general and the neutrino
field in particular? That is the proposal. That is the only way that has ever
turned up within the framework of Einstein’s general relativity and Planck’s
quantum principle to account for spin. Is this the right path? It is difficult
to name any question more decisive than this in one’s assessment of “everything
as geometry.”

FIGURE 8. The multisheeted character of superspace.


Electromagnetism as a Statistical Aspect of Geometry? 
Other Questions 
It would be tempting to stop with this major issue if it did not open the
door to so many tributary questions. (1) When a new handle develops and
the number of descriptors rises by one, what boundary condition in superspace
connects the probability amplitude $\psi$ for 3-geometries of the original
topology with the probability amplitudes $\psi_+$ and $\psi_-$ for the two spin structures
of the new topology? (2) One may wish to describe for each newly
developing wormhole something like “the fractional amplitude going from $\psi$
into $\psi_+$.” However, with the fantastic number of wormholes that typically
come into consideration (~ 10??/cm?*) any such individual bookkeeping would 
seem for many purposes to be out of the question. It is equally impractical 
to keep account of the orientation of each of the ~ 107° spins in a ferromagnet. 
It is much more appropriate to speak of the “density of magnetization” and
of perturbations in that density carried by “magnons” [71]. What are the
analogous quantities and concepts in geometrodynamics? What statistical
approach is best suited to keep account of the many “matching ratios”
$\psi_+/\psi$ associated with all the nascent wormholes in the geometry? (3) One
speaks of “magnetization,” knowing well that the term “magnetization” has
not the slightest real meaning at subatomic distances. Is the term “electromagnetic
field” equally without submicroscopic physical significance? In
other words, among the statistical parameters most appropriate for keeping 
account of the 10°° ‘‘ matching ratios” per cubic centimeter, is there one set of 
statistical parameters that one can identify with the electromagnetic field ? 


The Example of Two-Geometries 


No one can ask about the physics of changes in the topology of 3-space 
without at least a look at the mathematics of changes in the topology of 
2-space. The superspace built on all 3-geometries is like no mathematical
object so much as the superspace “built on all 2-geometries.” No one did
so much to bring this mathematical object to light as Riemann, the same
Bernhard Riemann who taught that the curvature of space is a branch of
physics, and who provided the mathematical machinery to describe not only
curvature (the Riemann curvature tensor, $R_{\alpha\beta\gamma\delta}$) but also topology (the Betti
numbers $R_n$).

In his 1857 paper Riemann noted that all algebraic 2-geometries 
endowed with the topology of a 2-sphere are equivalent to each other under 
conformal transformation (multiplication of all three metric coefficients by a 
common position-dependent factor 2). In other words the equivalence class 
of conformally equivalent 2-geometries of the given topology ($S_2$; or “genus
g = 0”) consists of a single object. It therefore constitutes a single “point”
in what is not really a superspace itself, as we have been using that term, but a 
‘‘reduced superspace”’: reduced in the sense that the ‘‘(0o7)” ” degrees of 
freedom in 2 have here been “strained out” of superspace. 

Two-geometries with the topology of the torus ($T_2$; or one wormhole
$W_1$; or “genus g = 1”), Riemann showed, are not all equivalent to one
another under conformal transformation. Instead, when conformally
equivalent 2-geometries of the topology $T_2$ are identified, the family of objects
that results is a complex continuum of dimension 1 (two real dimensions) [72].
Two-geometries endowed with a larger number of wormholes (topology $W_g$;
genus g ≥ 2), after extraction of the conformal degrees of freedom, reduce to a
family of objects described by 3g − 3 complex parameters (6g − 6 real parameters).
“Reduced superspace,” built on the totality of conformally
equivalent closed orientable 2-geometries of all topologies, thus appears to
consist of a series of disjoint spaces, the first of dimension 0, the second of
dimension 1, the next of dimension 3, the next of dimension 6, and so on.
However, great developments in the analysis have taken place since the days of 
Riemann through the efforts of many investigators [73]. Today, thanks not 
least to the works of Lipman Bers, one knows how to define one single infinite- 
dimensional reduced superspace in which all these apparently disparate parts fit 
smoothly together [74]. What a model for the mathematical treatment of the 
superspace of general relativity! Yet, at the risk of seeming overdemanding, 
one has to ask for more. The superspace of physics is not to have any
“conformal factor strained out of it”; rather, it is if anything to be enlarged,
so as to include all the descriptors 1, of the spin structure. How fit all the 
pieces of this superspace smoothly together? Challenging problem, at the 
very heart of quantum geometrodynamics! 
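
The dimension count quoted for the disjoint pieces of the reduced superspace of 2-geometries can be tabulated in a few lines. The following is a minimal sketch added for illustration, not part of the lecture; the function name `reduced_superspace_dimension` is hypothetical, and the code simply encodes the values stated in the text (0 for genus 0, 1 for genus 1, 3g − 3 for genus g ≥ 2).

```python
def reduced_superspace_dimension(genus: int) -> int:
    """Complex dimension of the family of conformal classes of closed
    orientable 2-geometries of the given genus, as quoted in the text."""
    if genus < 0:
        raise ValueError("genus must be a non-negative integer")
    if genus == 0:
        return 0  # all 2-spheres are conformally equivalent: a single point
    if genus == 1:
        return 1  # tori: one complex parameter (two real dimensions)
    return 3 * genus - 3  # genus g >= 2: 3g - 3 complex parameters


if __name__ == "__main__":
    for g in range(5):
        complex_dim = reduced_superspace_dimension(g)
        # Each complex parameter counts as two real parameters.
        print(f"genus {g}: {complex_dim} complex, {2 * complex_dim} real")
```

For genus 0 through 3 the function reproduces the sequence 0, 1, 3, 6 given in the text.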


Other Aspects of Superspace 


Why all this emphasis on the structure of superspace? Why not simply
spell out explicitly the form of the “Einstein–Schroedinger equation”? For
the more elementary problem of a particle moving in flat 3-space it was
straightforward for Schroedinger to derive his wave equation. Simple considerations
of invariance with respect to translation and rotation show that $\nabla^2$
is the only simple differential operator that can come into play. That clear,
the principle of correspondence with classical physics gives all the rest. One 
can hope that equally compelling considerations will fix the detailed mathemat- 
ical form of the expression that we write down so far only symbolically, 


Vy 


However, a precise formulation of such considerations would seem to be out
of reach until one has a knowledge of the transformations of superspace comparable
to one’s knowledge of the transformations of 3-space. Hence the
emphasis on the structure of superspace.
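
As a reminder of the elementary case invoked here, the following sketch shows how the correspondence principle supplies the remaining terms once invariance has singled out the Laplacian. It is standard textbook material rather than a formula from the lecture; the symbols $m$ (particle mass) and $U$ (potential) are introduced only for this illustration.

```latex
% Classical energy relation, then the correspondence
%   E -> i\hbar \partial/\partial t ,   p -> -i\hbar \nabla
\begin{align}
  E &= \frac{\mathbf{p}^{2}}{2m} + U(\mathbf{x}), \\
  i\hbar\,\frac{\partial \psi}{\partial t}
    &= -\frac{\hbar^{2}}{2m}\,\nabla^{2}\psi + U(\mathbf{x})\,\psi .
\end{align}
```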

A problem of such depth can hardly be examined from too many 
points of view. Six more aspects of superspace seem worthy of mention. 
One can summarize them under the names “metric,” “residual causality,”
“initial value,” “conjugate momentum,” “collapse,” and “pregeometry.”
(1) There could hardly be a more helpful guide to the structure of superspace 
than the metric that obtains in it, 


| 
(5) Djt+ Gi jx — Gi Iu 


Reference is made to [2] for the most illuminating discussion of this metric
given to date. (2) This metric has a “light cone” associated with it. This
light cone makes propagation proceed anisotropically in superspace. This
anisotropy differs in character, however, from place to place according as
$^{(3)}R$ is positive or negative. This anisotropy imposes a kind of “residual
causality” upon superspace. Charles W. Misner has pointed out in a conversation
that one cannot forget this residual causality when one says that the
customary ideas of “before” and “after” lose their meaning at the scale of
the Planck length. Perhaps one can say more: If the principle of causality 
has been of service in analyzing the structure of flat spacetime, it can hardly 
fail to help in studying the structure of superspace. 


Tangent Vectors on Superspace and the Classical Initial Value 
Problem 


(3) In the classical mechanics of a particle one is accustomed to specifying
freely $x_0$ and $(dx/dt)_0$. These initial conditions determine the whole
future history of the particle. What are the analogous freely disposable
initial value data of classical geometrodynamics? One is tempted to say:
conceive of a continuous one-parameter family of 3-geometries, specified for
example in one coordinate patch by the 6 metric coefficients $g_{ik}(x, y, z; \lambda)$; and
use this one-parameter family to define data analogous to $x_0$ and $(dx/dt)_0$ in
the one-particle problem; thus, $^{(3)}\mathcal{G}_0$ stands for the class of metrics equivalent
to $g_{ik}(x, y, z; 0)$; and $(d\,^{(3)}\mathcal{G}/d\lambda)_0$ is the “tangent vector in superspace”
defined by $[\partial g_{ik}(x, y, z; \lambda)/\partial\lambda]_{\lambda=0}$, modulo the group of coordinate transformations.

It is easy for a mere coordinate shift to mock up the appearance of a
change in geometry. Let the coordinates be shifted so that the point P, formerly
characterized by the coordinates $x^i$, is now characterized by $x^i - \lambda\xi^i$,
with the vector field $\xi^i$ a continuous function of position. The metric $g_{ik}$
is altered by this shift to

$$g_{ik} + \lambda(\xi_{i|k} + \xi_{k|i})$$

The derivative $(d\,^{(3)}\mathcal{G}/d\lambda)$ is $\xi_{i|k} + \xi_{k|i}$, modulo the group of coordinate
transformations. But this quantity, by reason of its very origin, is obviously
annullable by a coordinate transformation. Consequently there is in this
case no real change in the geometry. In other words, one cannot admit any
otherwise reasonable-looking field of values for $(\partial g_{ik}/\partial\lambda)_0$ without running
the risk of deception. One wants what has been called in [2] (GMD & IFS) a
3-geometry (of “acceptable” topology) and another “nearby” 3-geometry
in order, one trusts (central hypothesis of the subject!), to be able to determine
the entire past and future of the space, and thereby a complete 4-geometry.
But if one has been “deceived” in describing what is ostensibly a second and
“nearby” 3-geometry, he may merely be repeating all over again the previously
given 3-geometry. In that event he has only half the amount of
initial value data needed to predict the dynamics. One can restate the situation
in the following terms in the context of the 4-geometry (regarded temporarily
as known!). A spacelike slice is made through the 4-geometry.
That gives the one 3-geometry demanded as one of the essential ingredients
of the initial value data. However, that one slice is not adequate to distinguish
the given 4-geometry from any number of other, different 4-geometries
that admit as slice the same 3-geometry. To complete the selection of the
given 4-geometry from these alternative 4-geometries, erect vectors $\lambda n^\alpha$ at each
of the points of the 3-geometry, with $n^\alpha$ a continuous function of position.
Their tips define a new hypersurface, the coordinates in which are connected
continuously with the coordinates in the original hypersurface. Evaluate
the metric coefficients $g_{ik}(x, y, z; \lambda)$ on this hypersurface. This hypersurface
can be said to have been “pushed forward” with respect to the original hypersurface.
But has it? Yes, if the normal component of $\lambda n^\alpha$ nowhere vanishes.
However, Hans Ohanian and Elliot Belasco have emphasized in unpublished
remarks that it may happen that there are whole regions of the hypersurface
where the normal component of $\lambda n^\alpha$ vanishes. In that case one has not really
pushed the hypersurface ahead at all in $^{(4)}\mathcal{G}$. In this event the second component
of the initial value data, the derivatives $(\partial g_{ik}/\partial\lambda)_0$, will simply be
inadequate for the purposes of the elliptic initial value equations [75]. This
situation will be signaled by the fact that $(\partial g_{ik}/\partial\lambda)_0$ can be written in the form
$\xi_{i|k} + \xi_{k|i}$. In other regions the 3-geometry will have been pushed forward
in time. This situation will be signaled, except in special circumstances
(time-symmetric initial value problem; change in $g_{ik}$ proportional to $\lambda^2$ rather
than $\lambda$; situation covered by appropriate care in the formulation), by the fact
that $(\partial g_{ik}/\partial\lambda)_0$ (or $(\partial^2 g_{ik}/\partial\lambda^2)_0$ in special cases like the time-symmetric initial
value problem) is not representable in the form $\xi_{i|k} + \xi_{k|i}$.
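
The first-order change in the metric produced by the coordinate shift described in this discussion can be checked directly. The following is a sketch under two assumptions not spelled out in the text: that the stroke denotes covariant differentiation with respect to the 3-metric, and that indices are lowered with that metric. The primed coordinates and the comma notation for partial derivatives are introduced here only for the calculation.

```latex
\begin{align}
  x'^{\,i} &= x^{i} - \lambda\,\xi^{i}(x), \qquad
  \frac{\partial x^{m}}{\partial x'^{\,i}}
     = \delta^{m}_{i} + \lambda\,\xi^{m}{}_{,i} + O(\lambda^{2}), \\
  g'_{ik}(x') &= \frac{\partial x^{m}}{\partial x'^{\,i}}\,
                 \frac{\partial x^{n}}{\partial x'^{\,k}}\;
                 g_{mn}(x' + \lambda\xi) \\
  &= g_{ik}(x') + \lambda\left( \xi^{m}{}_{,i}\,g_{mk}
        + \xi^{n}{}_{,k}\,g_{in} + \xi^{l}\,g_{ik,l} \right) + O(\lambda^{2}) \\
  &= g_{ik}(x') + \lambda\left( \xi_{i|k} + \xi_{k|i} \right) + O(\lambda^{2}).
\end{align}
```

The last step uses the identity $\xi_{i|k} + \xi_{k|i} = g_{im}\,\xi^{m}{}_{,k} + g_{km}\,\xi^{m}{}_{,i} + \xi^{l}\,g_{ik,l}$, in which the connection terms combine into the derivative of the metric along the shift. A change of this form carries no information about a second, distinct 3-geometry.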

It is natural to try to summarize the whole situation in the following
form. Give a point in superspace and give a “fully developed direction” at this
point in superspace. Then (hypothesis!) this information is sufficient, together
with Einstein’s equations, uniquely to determine the entire 4-geometry. Here
the term “point in superspace” implies, as earlier, the demand that the
3-geometry in question have acceptable topology. The term “fully developed
direction” implies that there is no point on the 3-geometry where the quantity
$(\partial g_{ik}/\partial\lambda)_0$ (or if it vanishes, the quantity $(\partial^2 g_{ik}/\partial\lambda^2)_0$) can be expressed in the
form $\xi_{i|k} + \xi_{k|i}$. In brief, does superspace provide a new approach to the
classical initial value problem? And in turn, does that initial value problem
throw new light on the concept of “direction” in superspace?

(4) Superspace is built on the concept of 3-geometry; but dynamically
conjugate to 3-geometry is the geometrodynamical momentum, with components
$\pi^{ij}$. Out of these objects, with all the varied topologies that they can
have, one can build a “conjugate superspace.” What are its properties?

(5) No crisis stands out more insistently in all of physics than gravita- 
tional collapse. No topic connects so immediately the world of the very large 
and the very small. What insights can one gain from the concept of super- 
space into the cause and consequences of gravitational collapse? 

(6) How far can one go in analyzing the properties of superspace without
getting into the problems of “pregeometry”?


Level 3. Pregeometry 


Weyl remarks [76] “... a more detailed scrutiny of a surface might disclose
that, what we had considered an elementary piece, in reality has tiny
handles attached to it which change the connectivity character of the piece,
and that a microscope of even greater magnification would reveal ever new
topological complications of this type, ad infinitum.” Under such circumstances
it would seem difficult to uphold the concept of dimensionality at the
smallest distances. General arguments [77] emphasize the same point.
Moreover, if electromagnetism and other fields have to do with the quantum-mechanical
resonance of space between one topology and another, why
should not the concept of metric itself be likewise a derived concept, going
back for its foundation to topological or pretopological, and at any rate to
pregeometric, ideas (“distance between A and B” being defined in the last
analysis, for example, by “the ramification of the connections between A and
B”)? One cannot even mention these topics without recalling the universal
sway of the quantum principle, and without stressing the “order of creation”
as one thinks of it from physical evidence: Not first geometry and then the
quantum principle, but first the quantum principle and then geometry!

It is enough to raise these issues, with all their depth, to see into what 
difficulties one can get with quantum geometrodynamics if one tries to think of 
it as an “ultimate’’ theory. However, physics has never depended for its 
progress on having an ultimate theory. There is no reason to think that the 
situation is different today. While one can raise ultimate issues of all kinds, 
there is no reason to believe that they all have to be settled now! Nor that one
can resolve them now! The subject presents an ever widening list of issues 
that have lively physical interest and lend themselves to well-known methods 
of analysis [78]. 


PROBLEM 3. INITIAL CONDITIONS 


The classical initial value problem has already been discussed. What 
can one say about the corresponding problem in quantum geometrodynamics ? 
In other words, how much information must one give about $\psi(^{(3)}\mathcal{G})$ on an
appropriate submanifold of superspace in order to be able to predict this
probability amplitude everywhere in superspace? And what is the character
of this submanifold? In this connection one recalls that the “Einstein–Schroedinger
wave equation” is of the second order. This second-order
character raises a question of principle: In order to be able to calculate $\psi$
everywhere, must one know on a hypersurface of superspace not only $\psi$ but
also its normal derivative? No, Leutwyler suggests in a most interesting
paper [79]. He points out in the context of a simplified model that the 
natural features of superspace itself impose certain natural boundary condi- 
tions. They reduce the effective order of the equation from second to first. 

Wider questions of principle are also posed by the very structure of 
quantum geometrodynamics. The arena of the dynamics is not space, but 
superspace. At first this development seems preposterous. How can one 
speak sensibly of any physical predictions when the outcome depends on 
what is taking place in unreachable regions of superspace? Nothing could 
seem more at variance with the spirit of science as dealing only with the know- 
able. However, a closer look shows that one has broken not at all with the 
traditional spirit of dynamics, but only with the details. In classical dynamics
a clean distinction has always been maintained between (1) the equations
of motion, which one can hope to know and understand, and (2) the origin of 
the initial conditions for those equations of motion—which is beyond one’s 
power to investigate [80]. Quantum geometrodynamics maintains a similar 
cut between the knowable and the unknowable, but the cut comes in a new 
place [81]. Nothing seems to exclude the possibility ultimately to know (1) 
the detailed form of the Einstein—Schroedinger equation, and the concomitant 
structure of superspace; but as for (2) the source of the initial conditions on $\psi$,
that would seem as far as ever beyond one’s power ever to know. Happily, 
in neither classical dynamics nor quantum geometrodynamics does one have 
to know all initial conditions to make useful predictions! On the contrary, as 
Wigner has so often stressed [82], the role of physics is to predict the correla- 
tions between observations. 


REFERENCES AND NOTES 


1. L. Rosenfeld, Annalen der Physik 5, 113 (1930) and Z. Physik 65, 589 (1930)
and Annales de l’Institut Henri Poincaré 2, 25 (1932); P. G. Bergmann, Phys. Rev. 75,
680 (1949); P. G. Bergmann and J. H. M. Brunings, Rev. Mod. Phys. 21, 480 (1949);
Bergmann, Penfield, Schiller, and Zatzkis, Phys. Rev. 78, 329 (1950); P. A. M. Dirac,
Can. J. Math. 2, 129 (1950); F. A. E. Pirani and A. Schild, Phys. Rev. 79, 986 (1950);
P. Bergmann, Helv. Phys. Acta, Suppl. IV, 79 (1956), Nuovo Cimento 3, 1177 (1956)
and Rev. Mod. Phys. 29, 352 (1957); C. W. Misner, Rev. Mod. Phys. 29, 497 (1957);
B. S. DeWitt, Rev. Mod. Phys. 29, 377 (1957); P. A. M. Dirac, Proc. Roy. Soc.
(London) A246, 326 and 333 (1958) and Phys. Rev. 114, 924 (1959); B. S. DeWitt,
“The Quantization of Geometry”, a chapter in Gravitation: An Introduction to
Current Research (L. Witten, ed.), Wiley, New York, 1962; J. Schwinger, Phys. Rev.
130, 1253 (1963) and 132, 1317 (1963);

R. P. Feynman, mimeographed letter to V. F. Weisskopf dated 4 January to 11 
February 1961; Acta Physica Polonica 24, 697 (1963); Lectures on Gravitation (notes 
mimeographed by F. B. Morinigo and W. G. Wagner, California Institute of Tech- 
nology, 1963); report in Proceedings of the 1962 Warsaw Conference on the Theory of 
Gravitation (PWN-Editions Scientifiques de Pologne, Warszawa, 1964); S. N. Gupta, 
report in Recent Developments in General Relativity, Pergamon, New York, 1962; 
S. Mandelstam, Proc. Roy. Soc. (London) A270, 346 (1962) and Annals of Physics 19, 
25 (1962); J. L. Anderson in Proceedings of the 1962 Eastern Theoretical Conference 
(M. E. Rose, ed.), Gordon and Breach, New York, 1963, p. 387; I. B. Khriplovich,
“Gravitation and Finite Renormalization in Quantum Electrodynamics” (mimeographed
report, Siberian Section Academy of Science, U.S.S.R., Novosibirsk, 1965);
H. Leutwyler, Phys. Rev. 134, B1155 (1964); B. S. DeWitt, “Dynamical Theory of
Groups and Fields,” in Relativity Groups and Topology (C. DeWitt and B. DeWitt,
eds.), Gordon and Breach, New York, 1964; S. Weinberg, Phys. Rev. 135, B1049 
(1964), 138, B988 (1965), and 140, B516 (1965); M. A. Markov, Progr. Theor. Phys., 
Yukawa Suppl., 1965, p. 85. 


2. P. W. Higgs, Phys. Rev. Letters 1, 373 (1958) and 3, 66 (1959); R. Arnowitt,
S. Deser, and C. W. Misner, a series of papers summarized in “The Dynamics of
General Relativity” in Gravitation: An Introduction to Current Research (L. Witten,
ed.), Wiley, New York, 1962; A. Peres, Nuovo Cimento 26, 53 (1962); R. F. Baierlein,
D. H. Sharp, and J. A. Wheeler, Phys. Rev. 126, 1864 (1962); cf. also the Princeton
A. B. Senior Thesis of D. H. Sharp, May 1960 (unpublished); J. A. Wheeler,
Geometrodynamics, Academic Press, New York, 1962, cited hereafter as GMD, and
“Geometrodynamics and the Issue of the Final State,” cited hereafter as GMD &
IFS, a chapter in Relativity, Groups and Topology (C. DeWitt and B. DeWitt, eds.), 
Gordon and Breach, New York, 1964; B. DeWitt, Phys. Rev. 160, 1113 (1967), 
162, 1195 (1967) and 162, 1239 (1967), together cited hereafter as QTG. 


3. J. A. Wheeler, GMD & IFS; B. DeWitt, QTG. 


4. Michael D. Stern, Investigations of the Topology of Superspace, Princeton
A. B. Senior Thesis, May 1967 (unpublished) and Proc. Natl. Acad. Sci. U.S.A. 
(submitted for publication). 


5. R. Penrose, An Analysis of the Structure of Spacetime, Adams Prize Essay 
(mimeographed for limited distribution, Princeton University, Princeton, New 
Jersey, December 1966), gives a beautiful procedure to prescribe on the past light 
cone exactly enough geometrical information to determine out of Einstein’s field 
equations the complete 4-geometry everywhere within the past light cone. The 
treatment is given only in the analytic case (in which case the difference between
“inside” and “outside” the light cone does not make itself felt) but from general
considerations one must expect in the nonanalytic case that the data in question
determine the 4-geometry only within the past light cone. When the light cone points
into the future instead of the past, similar considerations of course apply, obtained
by the interchange of the words “past” and “future” in what is said in the text.
Problems arise with such formulations of the initial value problem “on the light
cone” when the propagation proceeds far in a space of variable curvature. Then
the light cone develops more than one sheet. Compare the several claps of thunder 
often heard from a single localized explosion! 


6. For a discussion of the “observer” as a “collector of printout” see for
example E. F. Taylor and J. A. Wheeler, Spacetime Physics, W. H. Freeman, 
San Francisco, 1966. 


7. For the idea of a general spacelike hypersurface as the manifold on 
which the magnitudes of quantum field theory are to be measured, see especially 
S. Tomonaga, Progr. Theor. Phys. 1, 34 (1946) and J. Schwinger, Phys. Rev. 74, 
1449 (1948). 


8. J. A. Wheeler, GMD & IFS. 
9. B. DeWitt, QTG. 


10. This way of writing the conditions for constructive interference is symbolic
only. In actuality the Hamilton–Jacobi function S depends not only upon
the $^{(3)}\mathcal{G}$ but upon an infinity of parameters that distinguish one solution of the
Hamilton–Jacobi equation from another. Thus, in a problem with one degree of
freedom we write $S = S_0(x, E) + \delta(E)$ and in a problem with n degrees of freedom
$S = S_0(x_1, \ldots, x_n; \alpha_1, \ldots, \alpha_n) + \delta(\alpha_1, \ldots, \alpha_n)$. In geometrodynamics there are
two degrees of freedom per space point, the magnitudes associated with which may
be designated by $\alpha$ and $\beta$. Thus the infinitude of freely disposable parameters may
be indicated by two freely disposable functions, $\alpha(u, v, w)$ and $\beta(u, v, w)$. The $\infty^3$
points are given by the $\infty^3$ possible choices of u, v, and w. The u, v, w manifold
may be, but is not required to be, the same as the manifold x, y, z of points in the
3-geometry (“alternative choices of parametrization of Hamilton–Jacobi function”).
In any case we write S as a functional of $\alpha$ and $\beta$; thus, $S = S_0(^{(3)}\mathcal{G}; \alpha(u, v, w),
\beta(u, v, w)) + \delta(\alpha(u, v, w), \beta(u, v, w))$. Then the conditions of constructive interference
become statements about functional derivatives; thus,


and 


This is an explicit form of the symbolic Eq. (11) of the text. Other ways of writing 
the equations of constructive interference also exist. 
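
The two displayed conditions referred to in this note did not survive in the text as reproduced here. By analogy with the finite-dimensional case they presumably take the form below. This is a sketch under that assumption, not a quotation of the original equations; it writes the phase term as $\delta$ and the parameter functions as $\alpha$ and $\beta$.

```latex
% One degree of freedom: stationarity of the phase in the parameter E
\begin{equation}
  \frac{\partial S}{\partial E}
    = \frac{\partial S_0(x, E)}{\partial E} + \frac{d\delta(E)}{dE} = 0 .
\end{equation}
% n degrees of freedom: one condition per parameter
\begin{equation}
  \frac{\partial S}{\partial \alpha_k} = 0 , \qquad k = 1, \ldots, n .
\end{equation}
% Geometrodynamics (assumed form): one condition per point (u, v, w)
% for each of the two parameter functions
\begin{equation}
  \frac{\delta S}{\delta \alpha(u, v, w)} = 0
  \qquad \text{and} \qquad
  \frac{\delta S}{\delta \beta(u, v, w)} = 0 .
\end{equation}
```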

11. Ulrich Gerlach, Bull. Am. Phys. Soc. for the Washington meeting of 
April, 1966, paper DE7, p. 340. 

12. A. Peres, Nuovo Cimento 26, 53 (1962). Here the unit of length is
$(16\pi)^{1/2} L^* = (16\pi\hbar G/c^3)^{1/2}$.

13. See [1] and [2]. 


14. R. F. Baierlein, D. H. Sharp, and J. A. Wheeler, Phys. Rev. 126, 1864 
(1962). 

15. The term “dimensionality” can be translated as the requirement of
“imbeddability” of all the $^{(3)}\mathcal{G}$’s in a $^{(4)}\mathcal{G}$, a requirement that would seem the
natural starting point for a derivation of the Einstein-Hamilton—Jacobi equation 
straight from first principles (see Problem 1). 


16. C. W. Misner, Rev. Mod. Phys. 29, 497 (1957); see also H. Leutwyler, 
Phys. Rev. 134, B1155 (1964) and B. S. DeWitt, QTG. 

17. M. Planck, Sitzungsber. Preussische Akad. Wiss. Berlin, Math.-Phys. 
Klasse, 1899, p. 440; J. A. Wheeler, GMD. 


18. For a review of the subject of gravitational collapse, see for example 
B. K. Harrison, K. Thorne, M. Wakano, and J. A. Wheeler, Gravitation Theory and 
Gravitational Collapse, Univ. of Chicago Press, 1965; also A. G. Doroschkevich, 
Ya. B. Zel’dovich, and I. D. Novikov, J. Exptl. Theor. Phys. 49, 170 (1965), English
translation in Soviet Physics JETP 22, 122 (1966) and Ya. B. Zel’dovich and 
I. D. Novikov, Usp. Fiz. Nauk 84, 377 (1964) and 86, 447 (1965), English translations 
in Soviet Physics Usp. 7, 763 (1965) and 8, 522 (1965). 


19. For a presentation of the quantum electrodynamical calculation of the 
major part of the Lamb shift of hydrogen from this point of view, see T. A. Welton, 
Phys. Rev. 74, 1157 (1948) and F. J. Dyson, Advanced Quantum Mechanics, Cornell 
Univ., Ithaca, 1954 (mimeographed), p. 54. 


20. J. A. Wheeler, GMD. 


21. J. A. Wheeler, GMD and GMD & IFS; B. S. DeWitt, QTG and “The
Quantization of Geometry” in Gravitation: An Introduction to Current Research
(L. Witten, ed.), Wiley, New York, 1962, p. 342 ff. 


22. In GMD. 


23. For an elementary discussion of the identity between the tide-producing 
component of the gravitational force and the Riemann curvature, see for example 
E. F. Taylor and J. A. Wheeler, Spacetime Physics, Freeman, San Francisco, 1966. 


24. E. P. Wigner, “The Unreasonable Effectiveness of Mathematics in the
Natural Sciences,” in his book Symmetries and Reflections, Indiana Univ. Press,
Bloomington, 1967; reprinted from Comm. Pure App. Math. 13, No. 1 (February, 
1960). 


25. Reference 17. 


26. A. S. Eddington, Relativity Theory of Protons and Electrons, Cambridge 
Univ. Press, 1936, and Fundamental Theory, Cambridge Univ. Press, 1946; also 
Proc. Camb. Phil. Soc. 27, 15 (1931). 


27. P. A. M. Dirac, Nature 139, 323 (1937), Proc. Roy. Soc. (London) A165, 
199 (1938). 

28. P. Jordan, Schwerkraft und Weltall, Vieweg und Sohn, Braunschweig, 
1955, and Z. Physik 157, 112 (1959). 


29. R. H. Dicke, Science 129, 3349 (1959) and The Theoretical Significance 
of Experimental Relativity, Gordon and Breach, New York, 1964, p. 72. 


30. S. Hayakawa, Progr. Theor. Phys. 33, 538 (1965) and Progr. Theor. Phys. 
Suppl. p. 532 (1965). 


31. It is permissible to take at full force the argument of Eddington, Dirac,
Jordan, Dicke, and Hayakawa, that a physical correlation exists between $10^{20}$, $10^{40}$
and $10^{80}$, without accepting the suggestion sometimes made in the same context,
that the physical constants may “change with time.” There is increasing observational
evidence against such changes, and no incontrovertible evidence for them has
ever been found. Among the relevant observations one can cite as examples R. 
H. Dicke, Nature 183, 170 (1959) and Nature 192, 440 (1961); also R. H. Dicke and 
P. J. E. Peebles, J. Geophys. Res. 67, 10 and 4063 (1962) and Phys. Rev. 128, 5 and 2006
(1962), showing no detectable change with time in the relative rates of selected pro- 
cesses of radioactive decay. To search for changes with time in the reciprocal fine 
structure constant, $\alpha^{-1} = \hbar c/e^2 = 137.03$, is simple in principle. One has only to
compare the wavelength of the 21-cm line of hydrogen (red shifted because it was 
given out by a rapidly receding galaxy, far away and long ago) with the wavelength 
of a line in the optical spectrum (which has undergone the same red shift). The 
ratio R of the two wavelengths is $\alpha^{-1}$ multiplied by a known function of the atomic
number of the source (an integer) and of the relevant quantum numbers (also
integers):

$$R = \alpha^{-1} \times (\text{function of integers})$$

The value of R for a source $1.4 \times 10^9$ light-years away (recession rate $\beta = v/c = 0.1$)
should differ by several percent from the value of R for a laboratory source if any of 
the suggestions are correct that physical constants might change in proportion to the 
time (or some significant power of the time) measured from the start of the expansion 
of the universe. The writer is indebted to the kindness of Professor R. Minkowski 
of Berkeley for the following (July 28, 1967) summary of the observational situation:
(1) Hopes have been dashed to observe the 21-cm line in the spectrum of
galaxies anywhere near as far away as would correspond to a recession velocity of
$\beta = 0.1$. (2) Observations have been made on 30 nearer objects by Dieter, Epstein,
Lilley, and Roberts, Astrophys. J. 67, 270 (1962) as supplemented by Roberts, 
Astrophys. J. 142, 148 (1965). The red shift is the same for the 21-cm line and for 
the optical lines within the limits of error of the observations. It is difficult to 
evaluate the accuracy because individual motions amount to as much as 20-30% of 
the average recession velocity of v = 1600 km/sec ($\beta = 0.005$). It is probably safe
to say that any change in $\alpha^{-1}$ must be less than a few percent to be compatible with
the observations. However, a change anyway smaller than this limit would be
expected on almost any of the varied theories of the change of $\alpha^{-1}$ with time, since
the time lapse in this case is only one two-hundredth of the Hubble time. (3) Instead
of comparing the wavelength of the 21-cm line with the wavelength of an
optical transition, one can measure the fine structure separation of a related pair of
lines in the optical spectrum itself. The fractional splitting $\Delta\lambda/\lambda$ should be independent
of the red shift of the source if $\alpha^{-1}$ is constant. From the observations of
R. Minkowski, Astrophys. J. 123, 373 (1956), on Cygnus A (v = 16830 km/sec,
$\beta = 0.056$; W. Baade and R. Minkowski, Astrophys. J. 119, 206 (1954)), it is again
probably safe to say that any change in the fine structure constant must be less than
a few percent. Bahcall, Sargent, and Schmidt, Ap. J. Lett. 149, 11 (1967) give
$|\delta\alpha/\alpha| < 0.05$ for z = 2.
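
The recession parameters quoted in this note can be checked with simple arithmetic. The sketch below is an illustration added here, not part of the original note; the function names are hypothetical, and the look-back estimate assumes a linear Hubble law, so that for small recession parameter the look-back time is roughly that parameter times the Hubble time.

```python
# Speed of light in km/s (exact by the SI definition of the metre).
C_KM_PER_S = 299_792.458


def beta(v_km_per_s: float) -> float:
    """Return the recession parameter beta = v/c for a velocity in km/s."""
    return v_km_per_s / C_KM_PER_S


def lookback_fraction_of_hubble_time(v_km_per_s: float) -> float:
    """Rough look-back time as a fraction of the Hubble time.

    Assumption (not from the source): linear Hubble law v = H d and a light
    travel time d/c, so the fraction is simply beta; valid only for beta << 1.
    """
    return beta(v_km_per_s)


if __name__ == "__main__":
    samples = [("nearby sample (average)", 1600.0), ("Cygnus A", 16830.0)]
    for label, v in samples:
        print(f"{label}: v = {v:.0f} km/s, beta = {beta(v):.4f}")
    fraction = lookback_fraction_of_hubble_time(1600.0)
    print(f"look-back time is about 1/{1.0 / fraction:.0f} of the Hubble time")
```

Under these assumptions the two velocities give recession parameters that round to the quoted 0.005 and 0.056, and the nearby sample corresponds to a look-back time of the order of one two-hundredth of the Hubble time.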

32. The quark, so useful in doing bookkeeping on the beautiful regularities 
of elementary particle physics (summarized, for example, in M. Gell-Mann and 
Y. Ne’eman, The Eightfold Way, Benjamin, New York, 1964 and F. Dyson, 
Symmetry Groups in Nuclear and Particle Physics, Benjamin, New York, 1966) has 
sometimes been taken much more seriously, as if it were an actual ‘‘ primordial 
building block’ of matter. That view may or may not be correct. If it is, and if 
one still continues to take quantum geometrodynamics as the only available indica- 
tor of what goes on at very small distances, then it would still seem reasonable to 
expect that one must have some perspective on what happens at $10^{-33}$ cm before one
can find the rationale of quarks and particles. That there is no such thing as a 
quark in the literal sense is, however, a point of view accepted by many investigators, 
and stressed especially by Heisenberg and Dürr: W. Heisenberg, Introduction to the
Unified Field Theory of Elementary Particles, Wiley, New York, 1966 and H. P. Dürr,
“On the non-linear spinor theory of elementary particles,” Acta Physica Austriaca,
Suppl. III, 1966. Dürr has made the same point even more vividly (kind personal
communication of June, 1967) by considering in effect what one would conclude out 
of the first several dozen atomic energy levels of an atom such as, for example, 
carbon or iron if one had (1) good measurements of the energies and transition 
probabilities, and (2) today’s aptitude for searching for symmetries, but (3) not the 
slightest idea of the actual internal machinery of an atom. He shows how groups of 
high symmetry will make their appearance. His discussion leads one to ask whether 
the innocent investigator will not conclude that the atom is made out of quarks! 

33. The strongest statement easily available against taking general relativity 
seriously at small distances appears to be that made by Robert Oppenheimer in his 
article “On Albert Einstein” (New York Review, March 17, 1966, pp. 4, 5): “He
also worked with a very ambitious program, to combine the understanding of 
electricity and gravitation in such a way as to explain what he regarded as the sem- 
blance—the illusion—of discreteness, of particles in nature. I think that it was 
clear then, and believe it to be obviously clear today, that the things that this theory 
worked with were too meager, left out too much that was known to physicists but 
had not been known much in Einstein’s student days. Thus it looked like a hopelessly
limited and historically rather accidentally conditioned approach.”


34. For further discussion of the rationale of changes in topology see 
Problem 2. 


35. GMD. 


36. To take it as self-evident that space is Euclidean in character at small 
distances became impossible after Riemann. His Gottingen inaugural lecture of 
June 10, 1854 pointed out that space can be highly rippled at submicroscopic distances
and yet look smooth to all ordinary means of observation: “Über die
Hypothesen welche der Geometrie zugrunde liegen” in his Gesammelte Mathematische
Werke (H. Weber, ed.), 2nd ed., reprinted by Dover, New York, 1953, also in a
translation in Nature 8, 14 (1873), by W. K. Clifford. Clifford himself went further
in his lecture before the Cambridge Philosophical Society February 21, 1870, “On
the Space-Theory of Matter,” reprinted in his Mathematical Papers (R. Tucker, ed.),
London, 1882, also in his Lectures and Essays (L. Stephen and F. Pollock, eds.),
Vol. 1, London, 1879. He proposed to consider a particle as made up of nothing
but curved empty space, differing from the surrounding space precisely in this 
localized curvature—and perhaps also in its connectivity or local topology. In Was 
ist Materie (Springer, Berlin, 1924, esp. pp. 57, 58) Hermann Weyl again pointed out 
that space may be multiply connected in the small, and consequently: “The
argument that the charge of the electron must be spread over a finite region, because
otherwise it would possess infinite inertial mass, has thus lost its force. One cannot
at all say, here is charge, but only, this closed surface encloses charge.” He went on
to comment that the enormous value of the ratio of electric to gravitational forces
“seems to indicate that the total number of electrons in the universe is important for
the constitution of the individual electron.” Albert Einstein and Nathan Rosen,
Phys. Rev. 48, 73 (1935), proposed the concept of two nearly Euclidean spaces, 
connected here and there by thin bridges or tubes, through which electric lines of 
force thread, to give the appearance of charges of variegated signs in the “upper”
space and corresponding charges of the opposite sign in the “lower” space.
J. A. Wheeler, Phys. Rev. 97, 511 (1955), reprinted in GMD, proposed instead the
concept of a tube or handle or “wormhole” reaching between two different localities
in one and the same Euclidean space. It is an automatic consequence of this 
picture that the universe should contain equal amounts of positive and negative 
electricity. It is another consequence, proved by Misner in 1957 straight from 
Maxwell’s equations for empty space, that the charge, or flux of lines through the 
wormhole, must stay constant with time. The proof, C. W. Misner and J. A. Wheeler,
Annals of Physics 2, 525 (1957), reprinted in GMD, holds no matter how tortuously 
the lines of force may be twisted, no matter how wanting in symmetry the geometry of 
the wormhole may be, and no matter how violently the field and the geometry may 
subsequently change with time. Misner also showed here the beautiful ties that 
exist between the Maxwell theory in a multiply connected empty space and the 
mathematics of differential forms and homology groups. In his analysis the field 
and the geometry were assumed to evolve deterministically in time, in accordance 
with the classical equations of electrodynamics and geometrodynamics. Reasons 
out of fluctuation theory to consider “wormholes” a property, not of particles, but
of all space, were first given by J. A. Wheeler in Annals of Physics 2, 604 (1957), 
expanded in GMD. 

37. No one has pointed out a more direct tie between the energy of vacuum 
fluctuations and macroscopic physics than H. B. G. Casimir, Proc. Nederland Akad.
Wetenschappen, Amsterdam 51, 793 (1948), who predicted a force between two
parallel metal plates. No attempt is made here to cite the extensive literature that 
verifies the existence and the predicted magnitude of this force. The same kind of 
fluctuations that are verified by this force at macroscopic distances are also checked 
at distances ~ 10~-'? cm by the Lamb shift, the most impressive single development 
in quantum electrodynamics in the post-World War II period (Fig. 4 and [19]).

38. D. R. Brill and J. B. Hartle, Phys. Rev. 135, B271 (1964). 

39. For a historical survey that treats chemistry and atomic physics as the two 
parts of a single development, see for example: W. G. Palmer, A History of the Con- 
cept of Valency to 1930, Cambridge Univ. Press, 1965, and especially J. J. Lagowski, 
The Chemical Bond, Houghton Mifflin, Boston, 1966. 




40. On these regularities see for example the books cited in [32]. 
41. J. Bardeen, L. N. Cooper, and J. R. Schrieffer, Phys. Rev. 108, 1175 (1957). 
42. The three issues listed here are taken up in more detail in the Appendix. 


43. A. Einstein, in P. A. Schilpp, ed., Albert Einstein: Philosopher-Scientist
(Library of Living Philosophers, Evanston, Illinois, 1949, p. 81) remarks: “If one
had the field-equation of the total field, one would be compelled to demand that the 
particles themselves would everywhere be describable as singularity-free solutions of 
the completed field-equations. Only then would the general theory of relativity be a 
complete theory.” 


44. E. Cartan, Leçons sur la géométrie des espaces de Riemann, Gauthier-Villars,
Paris, 2nd ed., 1959, Chapter 8.


45. J. A. Wheeler, Chapter 4 on Cartan’s geometrical interpretation of 
Einstein’s field equations in Gravitation and Relativity (H. Y. Chiu and W. F. Hoff- 
mann, eds.), Benjamin, New York, 1964. 


46. Arguments that the propagator appropriate for any particle of spin two 
and mass zero necessarily has a leading term of the form 


$$k^{-2}\left(g^{\mu\rho}g^{\nu\sigma} + g^{\mu\sigma}g^{\nu\rho} - g^{\mu\nu}g^{\rho\sigma}\right)$$


are given by S. Weinberg, Phys. Rev. 138, B988 (1965); also in expanded form in his 
contribution to S. Deser and K. W. Ford, eds., Brandeis Summer Institute in
Theoretical Physics, 1964, Vol. 2, Lectures on Particle and Field Theory, Prentice- 
Hall, Englewood Cliffs, New Jersey, 1965. 


47. See for example W. Pauli in “Die allgemeinen Prinzipien der Wellenmechanik”
in Handbuch der Physik (Geiger and Scheel, eds.), Springer, Berlin,
1933, Vol. 24, part 1; reprinted in revised form in the new Handbuch der Physik
(S. Flügge, ed.), Springer, Berlin, 1958, Vol. 5, part 1.


48. See the discussion of the problem of factor ordering in B. DeWitt, QTG; 
also the references to earlier discussions of this issue cited by him there. 


49. For a determination of the wave equation from (1) the principle of Lorentz
covariance and (2) a selection of one or another set of spin quantum numbers see
E. P. Wigner, Ann. of Math. 40, 149 (1939) and V. Bargmann and E. P. Wigner,
Proc. Natl. Acad. Sci. U.S.A. 34, 211 (1948), both reprinted in the collection
Symmetry Groups in Nuclear and Particle Physics (F. J. Dyson, ed.), Benjamin, New
York, 1966.


50. For an analysis of the metric of superspace see B. DeWitt, QTG [2], and 
other work cited by DeWitt; see also S. Weinberg [46], for the propagator in the 
linear theory of gravitation in the de Donder gauge, which mysteriously has the same 
form as the metric in superspace (but indices 0, 1, 2, 3 as compared to 1, 2, 3). 


51. For the proof that the topology of space cannot change within the context 
of classical geometrodynamics, see R. P. Geroch, J. Math. Phys. 8, 782 (1967). 


52. For a model of a closed universe with the topology $S^3$ put together out
of 720 identical pieces each endowed with the Schwarzschild geometry (“lattice
universe”) see R. W. Lindquist and J. A. Wheeler, Rev. Mod. Phys. 29, 432 (1957)
and further treatment in GMD & IFS, pp. 370-379. 




53. A. Einstein, end of chapter dealing with Mach’s principle in The Meaning
of Relativity, Princeton Univ. Press, Princeton, New Jersey, 3rd ed., 1950. 


54. See for example some of the solutions of Einstein’s equations given by 
B. K. Harrison, Phys. Rev. 116, 1285 (1959) and his fuller Princeton University 
Ph.D. thesis, Exact Three-Variable Solutions of the Field Equations of General 
Relativity, 1959 (unpublished). 


55. Here it is assumed that the conjecture of H. Poincaré is correct, that every 
simply connected compact differentiable three-dimensional manifold is homeomorphic
to the three-sphere. See C. D. Papakyriakopoulos, “The theory of
three-dimensional manifolds since 1950,” Proc. Intern. Congr. Mathematicians, 1958
(Cambridge Univ. Press, 1960), pp. 433-440; also J. Milnor, Topology from the 
Differentiable Viewpoint, Univ. of Virginia Press, Charlottesville, 1965, and the 
bibliography cited by Milnor; also J. Derwent, ‘‘ Handle decomposition of mani- 
folds,” J. Math. Mech. 15, 329 (1966). 


56. Particularly to be emphasized is the distinction between “asymptotically
flat” as that concept is so often understood in the classical context of a 4-geometry,
and the concept of flatness as it is applied to a 3-geometry in the context of Hamil- 
ton-Jacobi theory or quantum geometrodynamics. No example illustrates this 
distinction more clearly than the Schwarzschild geometry. There the rate of 
approach of the 4-geometry to flatness at infinity determines always a unique value 
for the mass of the center of attraction; but the analogous calculation for a space- 
like 3-geometry slicing through the 4-geometry gives quite different values for the 
apparent mass, depending upon the choice of slice. Thus, in the 4-geometry 


$$ds^2 = -(1 - 2m/r)\,dt^2 + (1 - 2m/r)^{-1}\,dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\phi^2)$$

take at large distances the spacelike slice

$$t = T_0 + (8\alpha r)^{1/2}$$

so that

$$dt = (2\alpha/r)^{1/2}\,dr$$

On this slice one finds a 3-geometry in which the coefficient of $dr^2$, also at large
distances, is

$$1 + \frac{2(m - \alpha)}{r}$$

The dependence of the “effective mass” $(m - \alpha)$ upon the choice of slice, through
the parameter $\alpha$, suggests some of the many dangers that seem to lurk in the concept
of “asymptotic flatness” as applied to three-geometries.
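The slice dependence described in this note can be checked numerically. The following Python sketch is an illustration added under stated assumptions, not part of the original lecture: it assumes units with G = c = 1, reads the slice as t = T_0 + (8αr)^{1/2}, and uses the hypothetical function names `g_rr_on_slice` and `apparent_mass`.

```python
import math


def g_rr_on_slice(r, m, alpha):
    """Coefficient of dr^2 induced on the slice t = T0 + sqrt(8*alpha*r).

    Assumes the Schwarzschild line element in units with G = c = 1.
    """
    lapse_sq = 1.0 - 2.0 * m / r          # the factor (1 - 2m/r)
    dt_dr = math.sqrt(2.0 * alpha / r)    # derivative of sqrt(8*alpha*r) w.r.t. r
    # -(1 - 2m/r) dt^2 + (1 - 2m/r)^(-1) dr^2, with dt = dt_dr * dr
    return 1.0 / lapse_sq - lapse_sq * dt_dr ** 2


def apparent_mass(r, m, alpha):
    """Read off M from the large-r form g_rr ~ 1 + 2M/r."""
    return 0.5 * r * (g_rr_on_slice(r, m, alpha) - 1.0)


if __name__ == "__main__":
    for alpha in (0.0, 0.25, 0.5):
        print(alpha, apparent_mass(1.0e6, 1.0, alpha))
```

For m = 1 the value read off at r = 10^6 stays close to m − α for each choice of α, which is the slice dependence the note warns about.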


57. In the Taub universe the effective mass-energy arises entirely from excita- 
tion of that mode of gravitational radiation which has the longest wavelength capable 
of fitting into this universe. For the metric of this model see A. Taub, Ann. of Math. 
53, 472 (1951) and C. W. Misner, J. Math. Phys. 4, 924 (1963).


58. To say that a particle is “pictured in terms of space resonating from one
topology to another” means more precisely that it is “pictured as a geometrodynamical
exciton, a state of excitation in which space resonates from one topology and
geometry to another according to a probability amplitude function slightly different
from, and orthogonal to, the probability amplitude function $\psi({}^{(3)}\mathcal{G})$ that describes
the vacuum.”

59. For a systematic development of ‘‘already unified field theory” see 
C. W. Misner and J. A. Wheeler, Annals of Physics 2, 525 (1957) (reprinted in GMD) 
where also reference is made to the earlier work of G. Y. Rainich. 


60. In “already unified field theory” the electromagnetic field tensor is
expressed, in accordance with Einstein’s field equations, in terms of the “Maxwell
square root” of the Ricci curvature tensor and its dual

$$F_{\mu\nu} = (R^{1/2})_{\mu\nu}\cos\alpha + ({}^{*}R^{1/2})_{\mu\nu}\sin\alpha$$

The change of the “complexion” $\alpha$ of the electromagnetic field from place to place is
fully determined by Maxwell’s equations in places where there is a field. However, 
consider a spacelike initial value hypersurface. On this hypersurface consider two 
regions, I, II, endowed with field and separated by a region III free of field. Within
each region individually the relative complexion is well determined, which is all
that matters momentarily for the electrodynamics. However, the complexion of 
region II relative to region I can never be found from purely geometrical measure- 
ments limited to this initial spacelike hypersurface. Moreover, this relative com- 
plexion is all important for the dynamic development of the electromagnetic field at 
those later points in space-time that can be reached by disturbances both from I and 
from II. In this sense the initial value problem of already unified field theory does 
not lend itself to purely geometrical formulation. For more on this topic see the 
chapter by L. Witten in the book of which he is also the editor, Gravitation: An 
Introduction to Current Research, Wiley, New York, 1962. 


61. The divergence condition is well known to follow from the invariance of
the Hamilton-Jacobi function with respect to the gauge transformation, $A_i^{\mathrm{new}} = A_i
+ \partial\Lambda/\partial x^i$; thus,

$$0 = \delta S = \int \frac{\delta S}{\delta A_i}\,\delta A_i\,d^3x = \int \mathcal{E}^i\,\frac{\partial\Lambda}{\partial x^i}\,d^3x = -\int \mathcal{E}^i{}_{,i}\,\Lambda\,d^3x$$

The vanishing of this expression for arbitrary $\Lambda$ gives the desired relation. It should
be emphasized that the quantity $\mathcal{E}^i$ as employed here is not a contravariant vector,
but $(g)^{1/2}$ times a contravariant vector (“vector density”). Were the contravariant
vector itself employed, the divergence relation would have to be expressed in terms of 
covariant derivatives rather than ordinary derivatives, complicating the derivation 
in the text. 


62. Here the symbol (© x B), stands for the covariant vector density ©'B,,. 




63. As Hermann Weyl emphasized long ago, Math. Z. 23, 271 (1925), “In den
geometrischen und physikalischen Anwendungen zeigte sich stets, dass eine Grössenart
nicht allein durch Angabe der Tensorstufe, sondern durch Symmetriebedingungen
charakterisiert ist.” [“In geometrical and physical applications it has always turned
out that a kind of quantity is characterized not by its tensor rank alone, but by
symmetry conditions.”] In other words, every physical quantity is represented by an
irreducible tensorial quantity; that is to say, by what S. S. Chern terms “a geometrical
object.” Weyl conceived of these geometrical entities as local. However, it is a
natural extension of his line of thought to speak of a functional $S$ or $\psi$ that depends
globally upon a 3-geometry, and upon a 2-form imbedded in that 3-geometry.


64. GMD [2], p. 88. 


65. John Milnor, “A survey of cobordism theory,” L’enseignement mathématique
8, 16 (1962); “Spin structures on manifolds,” ibid. 9, 198 (1963); “On the
Stiefel-Whitney numbers of complex manifolds and of spin manifolds,” Topology 3,
223 (1965); “Remarks concerning spin manifolds” in S. S. Cairns, ed., Differential
and Combinatorial Topology, Princeton Univ. Press, Princeton, New Jersey, 1965,
p. 55; André Lichnerowicz, Compt. Rend. Acad. Sci. Paris 252, 3742 (1961), 253, 940
(1961), and 253, 983 (1961), and summary of these results in the third part of the
chapter by Lichnerowicz, “Propagateurs, Commutateurs et Anticommutateurs en
Relativité Générale,” in C. and B. DeWitt, eds., Relativity, Groups and Topology,
Gordon and Breach, New York, 1964; D. W. Anderson, E. H. Brown, Jr., and
F. P. Peterson, “Spin cobordism,” Bull. Am. Math. Soc. 72, 256 (1966); “SU-cobordism,
KO-characteristic numbers and the Kervaire invariant,” Ann. of Math. 83, 54
(1966); W. C. Hsiang and B. J. Sanderson, “Twist-spinning spheres in spheres,”
Illinois J. Math. 9, 651 (1965). Appreciation is expressed to John Milnor, Roger 
Penrose, and Robert Geroch for discussions clarifying the concept of spin structure. 


66. Take a belt. Stretch it out flat and taut. Keeping the left-hand end A 
fixed in the left hand, twist the right-hand end B through 720°. Maintaining A and B
all the time parallel to their present orientations, move B in a complete circle about 
A (releasing for an instant one’s hold on B). The belt straightens out. Not 
so when there is only a 360° twist in it. The belt is relevant to the cube, the room, 
and the eight elastic strings. Before the cube is rotated at all, it can be pulled out 
through a window to some distance from the room. The eight elastic strings then 
take on the configuration of the belt. The distinction between 360° and 720° 
rotation for the belt applies equally to the “pseudo-belt” made up of the eight strings.
Another way of seeing that a 720° rotation restores the orientation entanglement rela- 
tion between the cube and its surroundings (picture of one cone rolling on another) 
is presented by R. Penrose and W. Rindler in a preprint of an appendix to a book 
that they have in preparation. Appreciation is expressed to Professor Penrose for 
the privilege of seeing this preprint. 

67. The possibility has been suggested that one may eventually be able to 
detect what is here called the ‘‘orientation entanglement relation’’ between an 
object and its surroundings by measuring the contact potential between one metallic 
object (subject to rotation) and another (held fixed): Y. Aharonov and L. Susskind, 
Phys. Rev. 158, 1237 (1967). 

68. Cf. section on orientability, Problem 2, Level 1. 

69. It is not new to abstractify geometry. Einstein’s curved space-time was 
in the beginning nothing if it was not a home for geodesics. How else, one asked,
could he predict a planetary motion? Later Einstein, Grommer, Infeld, and
Hoffmann threw out the geodesics. The field equations themselves, they showed,
predict the evolution of geometry with time, and hence the motion of concentrations 
of mass-energy. 

70. For a history of the concept of spin, see W. Pauli’s 1945 Nobel prize 
lecture, Exclusion Principle and Quantum Mechanics, Éditions du Griffon, Neuchâtel,
1947, also the relevant discussion in M. Fierz and V. F. Weisskopf, Theoretical
Physics in the Twentieth Century: A Memorial Volume to Wolfgang Pauli, Interscience,
New York, 1960.


71. See for example C. Kittel, Introduction to Solid-State Physics, 3rd ed., 
Wiley, New York, 1966. 


72. The torus can be converted into a single sheet by two cuts, and can then be 
conceived as laid out on the complex plane, with one corner at the origin. One 
adjacent corner can be identified arbitrarily with the number 1 + 0i, by appropriate
choice of scale (“conformal transformation”). The location of the other adjacent
corner $\tau = \tau_1 + i\tau_2$ is then completely determined. Also completely determined is
the concomitant 2-geometry, modulo the group of conformal transformations. The
quantity $\tau$ can be identified with the complex parameter mentioned in the text.


73. For a survey, with many references to the literature, see H. E. Rauch, “A
transcendental view of the space of algebraic Riemann surfaces,” Bull. Am. Math.
Soc. 71, 1 (1965). Appreciation is expressed to Professor Leon Ehrenpreis for 
elucidation of the subject and for this and the following reference. 


74. Lipman Bers, On the Moduli of Riemann Surfaces; lectures at the
Forschungsinstitut für Mathematik, Eidgenössische Technische Hochschule,
Zürich, 1964. Notes by L. M. and R. J. Sibner (mimeographed).


75. For the initial value equations of classical geometrodynamics see G. Dar- 
mois, Les équations de la gravitation einsteinienne, Gauthier-Villars, Paris, 1927; 
K. Stellmacher, Math. Ann. 115, 136 (1937); A. Lichnerowicz, J. math. pure appl. 23, 
37 (1944); Helv. Phys. Acta Suppl. 4, 176 (1956); Théories relativistes de la gravitation 
et de l’électromagnétisme, Masson, Paris, 1955; Yvonne Fourès-Bruhat, Acta Math.
88, 141 (1952); J. Rational Mech. Anal. 5, 951 (1956); and the chapter by Y. Fourès
(now Y. Choquet) in Louis Witten, ed., Gravitation: An Introduction to Current 
Research, Wiley, New York, 1962. 


76. H. Weyl, Philosophy of Mathematics and Natural Science (original German 
in 1927; translation by O. Helmer), Princeton Univ. Press, Princeton, New Jersey, 
1949, p. 91. 


77. GMD & IFS, pp. 495-499. 


78. An extensive list of problems open for further investigation is to be 
found in GMD & IFS. 


79. H. Leutwyler in Battelle Rencontres: 1967 Lectures in Mathematics and 
Physics, this volume p. 309. 


80. One is reminded in this connection of the statement of William James over
a half a century ago, that “Actualities seem to float in a wider sea of possibilities
from out of which they were chosen; and somewhere, indeterminism says, such
possibilities exist, and form a part of truth.” Appreciation is expressed to Paul Van de
Water for this quotation.

81. E. P. Wigner, Symmetries and Reflections, Indiana Univ. Press, Blooming- 
ton, Indiana, 1967; see J. M. Jauch, E. P. Wigner, and M. M. Yanase, Nuovo 
Cimento 48, 144 (1967), also B. DeWitt, QTG; also H. Everett III, Rev. Mod. Phys. 
29, 454 (1957); J. A. Wheeler, Rev. Mod. Phys. 29, 463 (1957) and GMD, p. 75. 


X 


The Topology of Wheeler's 
Superspace 


BRYCE S. DEWITT 


Most of the material covered in this seminar has appeared in 
Physical Review 160, 1113 (1967). 




XI


Boundary Conditions for 
the State Functional 
in Quantum Theory of Gravity 


H. LEUTWYLER 


1 Some Remarks on Functional Differential Equations
2 Weak-Field Model
3 Boundary Condition for Weak-Field Model
4 Explicit Solution of the Weak-Field Model
5 Superspace
6 Stationary Phase Approximation
References


There are various alternative formally equivalent formulations of the 
quantum theory of gravity, none of which has yet acquired anything like the 
status of rigorous existence. For an excellent review of the literature on this 
subject the reader is referred to a forthcoming paper by B. S. DeWitt [1]. 
I shall in the following adopt the functional representation of the quantum 
theory of gravity; the physical interpretation of this formulation has been 
investigated most thoroughly by J. A. Wheeler [2]. In this framework the 
state of the gravitational field is described by a function $\psi$ that maps the space
of all possible 3-geometries into the complex numbers. I refer the reader 
again to [1] for a detailed review of this formulation and simply quote the 
basic dynamical equation, the so-called Hamiltonian constraint: 


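$$\Bigl\{-\tfrac{1}{2}\,|g|^{-1}\bigl(g_{il}\,g_{km} - \tfrac{1}{2}\,g_{ik}\,g_{lm}\bigr)\frac{\delta^2}{\delta g_{ik}\,\delta g_{lm}} + l_0^{-4}R\Bigr\}\psi = 0 \qquad (1)$$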


where we have chosen a suitable local coordinate system $x^1, x^2, x^3$ such that
the 3-geometry is represented by the Riemannian metric $g_{ik}(x)$; $i, k = 1, 2, 3$.
The quantity $|g|$ denotes the determinant of the metric tensor and $R(x)$ is the
curvature invariant belonging to $g_{ik}(x)$. The parameter $l_0$ is a fundamental
physical constant with the dimension of a length

$$l_0 = (16\pi G\hbar c^{-3})^{1/2}$$

The numerical value of this constant is of the order of $10^{-32}$ cm.
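As a quick check of the order of magnitude, the following Python sketch evaluates the length (16πGħ/c³)^{1/2}. The CGS values of G, ħ, and c are rounded constants supplied here as assumptions, and the function name `l0_cm` is hypothetical; none of them appear in the original text.

```python
import math

# Rounded CGS constants (assumptions of this sketch, not quoted from the text).
G = 6.674e-8       # gravitational constant, cm^3 g^-1 s^-2
HBAR = 1.0546e-27  # reduced Planck constant, erg s
C = 2.998e10       # speed of light, cm / s


def l0_cm(g=G, hbar=HBAR, c=C):
    """Return (16 pi G hbar / c^3)^(1/2) in centimeters."""
    return math.sqrt(16.0 * math.pi * g * hbar / c ** 3)


if __name__ == "__main__":
    print(f"l0 = {l0_cm():.2e} cm")
```

With these inputs the result lies near 10^-32 cm, a factor (16π)^{1/2} ≈ 7 above the conventional Planck length (Għ/c³)^{1/2}.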

Physical state functionals $\psi = \psi[g]$ are solutions of the functional
differential equation (1) that are independent of the particular local coordinate
system chosen to describe the geometry.

In the following I wish to make some very simple-minded heuristic 
remarks concerning the solutions of the dynamical equation (1). 


1 SOME REMARKS ON FUNCTIONAL DIFFERENTIAL 
EQUATIONS 


In order to get some feeling for the character of the dynamical equation 
(1) let me first give a definition of the notion of functional derivative¹ and
then illustrate by means of a simple functional differential equation. 


a. Definition of Functional Derivative 


Let $f(x)$ be an element of some normed vector space $V$ of functions
over the real line, for example, $V = L_2$, and let $\psi$ be a functional that maps
$V$ into $\mathbb{C}$. The functional $\psi$ is then said to possess a functional derivative
at $f \in V$ if the limit

$$\psi'_f[g] = \lim_{\varepsilon \to 0}\frac{1}{\varepsilon}\,\{\psi[f + \varepsilon g] - \psi[f]\}$$

exists for all $g \in V$ and moreover defines a linear functional on $V$. The
functional derivative $\delta\psi/\delta f(x)$ is the distribution corresponding to this
linear functional, symbolically,

$$\psi'_f[g] = \int dx\,\frac{\delta\psi}{\delta f(x)}\,g(x)$$


b. Example 1 


As a very simple model for the dynamical equation (1) let us consider 
the following functional differential equation (FDE): 
$$\frac{\delta^2\psi}{\delta f(x)\,\delta f(x)} + a^2\psi = 0 \qquad (2)$$


1 I am indebted to Prof. A. Lichnerowicz for an illuminating discussion of this
point. 




where $a$ is a constant. To find the solutions of this FDE let us first solve the
analogous problem in partial differential equations. Let us replace the
function $f(x)$ by a countable number of arguments $f_n$ which may be viewed
as the values of $f(x)$ at a given set of lattice points $x_n$. The analog of (2)
then reads

$$\frac{\partial^2\psi}{\partial f_n^2} + a^2\psi = 0$$

A complete set of solutions of this system is given by

$$\psi_\varepsilon = \exp\Bigl(ia\sum_n \varepsilon_n f_n\Bigr)$$


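where $\varepsilon = \{\varepsilon_n\}$ is an arbitrary sequence of signs $\varepsilon_n = \pm 1$. This suggests an
immediate analog in the case of the FDE (2). Let $\varepsilon(x)$ be some step function
with values $\pm 1$ such that the integral $\int dx\,\varepsilon(x) f(x)$ exists for all $f \in V$. Then
the functional

$$\psi_\varepsilon = \exp\Bigl(ia\int dx\,\varepsilon(x) f(x)\Bigr) \qquad (3)$$

possesses the functional derivatives

$$\frac{\delta\psi_\varepsilon}{\delta f(x)} = ia\,\varepsilon(x)\,\psi_\varepsilon$$

$$\frac{\delta^2\psi_\varepsilon}{\delta f(x)\,\delta f(y)} = -a^2\,\varepsilon(x)\,\varepsilon(y)\,\psi_\varepsilon$$

In particular $\delta^2\psi_\varepsilon/\delta f(x)\,\delta f(x)$ exists and, since $\varepsilon(x)^2 = 1$, satisfies the FDE (2). To every step
function $\varepsilon(x)$ which is sufficiently well behaved to guarantee the existence of
the integral in (3) there thus belongs a solution of the FDE (2).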
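The lattice analog of Example 1 can be checked numerically. The Python sketch below is an illustration under stated assumptions, not part of the original argument: it truncates the lattice to a finite number of points, picks arbitrary values for a, the signs, and the arguments f_n, and estimates the second derivative by central differences. The names `psi` and `residual` are hypothetical.

```python
import cmath


def psi(f, eps, a):
    """Lattice solution exp(i a sum_n eps_n f_n) on a finite set of points."""
    return cmath.exp(1j * a * sum(e * x for e, x in zip(eps, f)))


def residual(f, eps, a, n, h=1e-4):
    """Central-difference estimate of d^2 psi / d f_n^2 + a^2 psi."""
    up = list(f)
    up[n] += h
    dn = list(f)
    dn[n] -= h
    second = (psi(up, eps, a) - 2 * psi(f, eps, a) + psi(dn, eps, a)) / h ** 2
    return second + a ** 2 * psi(f, eps, a)


if __name__ == "__main__":
    f = [0.3, -1.2, 0.7, 2.0]   # arbitrary lattice values f_n
    eps = [1, -1, -1, 1]        # an arbitrary sequence of signs
    for n in range(len(f)):
        print(n, abs(residual(f, eps, 1.5, n)))
```

The residual is small for every lattice index and every choice of signs, because each sign enters the second derivative only through its square.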


c. Example 2 

A more interesting example is provided by the FDE

$$\frac{\delta^2\psi}{\delta f(x)\,\delta f(x)} - b^2 f(x)\,\psi = 0 \qquad (4)$$

and we intend to show in the next section that this example has some bearing
on the dynamical equation of the quantum theory of gravity. In this case
the analogous system of partial differential equations reads

$$\frac{\partial^2\psi}{\partial f_n^2} - b^2 f_n\,\psi = 0$$

To get a complete set of solutions we separate variables

$$\psi = \prod_n \psi_n(f_n)$$



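The functions $\psi_n$ satisfy an ordinary differential equation whose solutions are
Airy functions with the asymptotic behavior

$$\psi_n^{\pm}(f) \simeq A f^{-1/4}\exp\bigl(\pm\tfrac{2}{3}\,b f^{3/2}\bigr) \qquad (f \to \infty)$$

A complete set of solutions may then be represented by means of products
of the type

$$\psi_\varepsilon = \prod_n \psi_n^{\varepsilon_n}(f_n)$$

where $\varepsilon = \{\varepsilon_n\}$ is again a sequence of signs $\varepsilon_n = \pm 1$.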
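The quoted asymptotic form can be tested against the one-variable equation ψ'' − b² f ψ = 0. The Python sketch below is an illustration under stated assumptions: it sets A = 1, takes b > 0, and approximates ψ'' by central differences. The names `psi_asym` and `relative_residual` are hypothetical. Since the asymptotic form is only a leading approximation, the residual is expected to be small relative to b² f ψ and to shrink as f grows, not to vanish.

```python
import math


def psi_asym(f, b, sign=-1):
    """Leading asymptotic form f^(-1/4) exp(sign * (2/3) b f^(3/2)), with A = 1."""
    return f ** -0.25 * math.exp(sign * (2.0 / 3.0) * b * f ** 1.5)


def relative_residual(f, b, sign=-1, h=1e-3):
    """(psi'' - b^2 f psi) / (b^2 f psi), with psi'' from central differences."""
    center = psi_asym(f, b, sign)
    second = (psi_asym(f + h, b, sign) - 2.0 * center
              + psi_asym(f - h, b, sign)) / h ** 2
    return (second - b ** 2 * f * center) / (b ** 2 * f * center)


if __name__ == "__main__":
    for f in (5.0, 10.0, 20.0):
        print(f, relative_residual(f, 1.0, -1), relative_residual(f, 1.0, +1))
```

Analytically the leading correction is 5/(16 b² f³) for either sign, so both the decaying and the growing branch pass the same check.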

It does not seem to be a simple matter to construct the corresponding 
complete set of solutions of the FDE (4). For the present purpose we are 
only interested in a very particular solution that satisfies the boundary 
condition 


$$\psi[\lambda f] \to 0 \qquad (\lambda \to \infty) \qquad (5)$$
for all $f \in V$. The analogous solution of the system of partial differential
equations is obviously characterized by the sequence $\varepsilon_n = -1$. A formal


description of the solution defined by (5) may be obtained as follows: The 
boundary condition (5) guarantees at least formally that the functional 
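$\psi[f]$ admits of a functional Fourier transform

$$\psi[f] = \int Dp\,\exp\Bigl\{i\int dx\,p(x) f(x)\Bigr\}\,\varphi[p]$$

In Fourier space the FDE (4) reads

$$ib^2\,\frac{\delta\varphi}{\delta p(x)} + p(x)^2\,\varphi = 0$$

This first-order equation is easily solved with the result

$$\psi[f] = \int Dp\,\exp i\int dx\,\Bigl\{p(x) f(x) + \frac{1}{3b^2}\,p(x)^3\Bigr\} \qquad (6)$$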


This is of course only a formal solution since the existence of functional 
Fourier integrals of this type has not been established. 
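For orientation, the one-variable analog of this construction can be carried through in closed form. The following lines are an added sketch, not part of the original text; they assume b > 0 and the standard integral representation of the Airy function Ai, and they read the Fourier-space equation as i b² dφ/dp + p² φ = 0.

```latex
\begin{align*}
  &\psi''(f) - b^{2} f\,\psi(f) = 0, \qquad \psi(f) \to 0 \quad (f \to +\infty), \\[4pt]
  &\varphi(p) = \exp\Bigl(\frac{i\,p^{3}}{3b^{2}}\Bigr)
     \quad\text{solves}\quad i b^{2}\,\frac{d\varphi}{dp} + p^{2}\varphi = 0, \\[4pt]
  &\psi(f) = \int_{-\infty}^{\infty} dp\,
     \exp i\Bigl\{ p f + \frac{p^{3}}{3b^{2}} \Bigr\}
     = 2\pi\, b^{2/3}\, \mathrm{Ai}\bigl(b^{2/3} f\bigr)
     \qquad (p = b^{2/3} t), \\[4pt]
  &\mathrm{Ai}(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} dt\,
     \exp i\Bigl\{ \frac{t^{3}}{3} + x t \Bigr\}, \qquad
   \mathrm{Ai}(x) \simeq \frac{1}{2\sqrt{\pi}}\, x^{-1/4}
     \exp\bigl(-\tfrac{2}{3}\, x^{3/2}\bigr) \quad (x \to \infty).
\end{align*}
```

With x = b^{2/3} f the exponent becomes −(2/3) b f^{3/2}, so the Fourier construction selects the decaying Airy branch, in line with the boundary condition (5).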


2 WEAK-FIELD MODEL 


We now turn back to the dynamical equation of the quantum theory 
of gravity and consider the following simplified model. Let us restrict our 
attention to 3-geometries which admit of a global coordinate system such 
that the metric tensor g,,(x) is close to the Euclidean metric 


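$$g_{ik}(x) = \delta_{ik} + h_{ik}(x), \qquad |h_{ik}| \ll 1 \qquad (7)$$

In first approximation the Ricci tensor of such a geometry takes the form

$$R^{(1)}_{ik} = -\tfrac{1}{2}\bigl(\partial_l\partial^l h_{ik} + \partial_i\partial_k h_l{}^l - \partial_i\partial_l h_k{}^l - \partial_k\partial_l h_i{}^l\bigr) \qquad (8)$$

and the curvature invariant is given by

$$R^{(1)} = R^{(1)\,i}{}_{i} = \partial^i\partial^k h_{ik} - \partial_l\partial^l h_k{}^k$$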

In the same approximation the dynamical equation (1) now reads 


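$$\Bigl\{-\tfrac{1}{2}\bigl(\delta_{il}\,\delta_{km} - \tfrac{1}{2}\,\delta_{ik}\,\delta_{lm}\bigr)\frac{\delta^2}{\delta h_{ik}\,\delta h_{lm}} + l_0^{-4}R^{(1)}\Bigr\}\psi = 0 \qquad (9)$$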
We are interested in solutions of this equation that are independent of the 
choice of the coordinate system. To stay within the approximation (7) we 
restrict this requirement to small deformations of the coordinate frame 


$$x'^{\,i} = x^i + f^i(x), \qquad |\partial_k f^i| \ll 1$$
such that the transformation rule for the field $h_{ik}(x)$ reads

$$h'_{ik}(x) = h_{ik}(x) + \partial_i f_k(x) + \partial_k f_i(x) \qquad (10)$$


The functional $\psi[h]$ is coordinate independent in this approximation if its
values at $h'_{ik}(x)$ and at $h_{ik}(x)$ are the same. This invariance condition can
be solved easily in the present approximation. It implies that $\psi[h]$
depends on $h_{ik}(x)$ only through the gauge invariant “field strengths” $R^{(1)}_{ik}(x)$
defined in (8). The proof of this statement is elementary; it makes use of
the condition that the geometry is asymptotically flat,² $h_{ik}(x) \to 0$ $(|x| \to \infty)$.
The gauge transformation rule (10) means that only three of the six components
of $h_{ik}(x)$ are relevant. This makes it clear that the six gauge invariant
field strengths $R^{(1)}_{ik}(x)$ cannot be independent. In fact the field strengths
satisfy the three Bianchi identities

$$\partial^i\bigl\{R^{(1)}_{ik} - \tfrac{1}{2}\,\delta_{ik}\,R^{(1)}\bigr\} = 0 \qquad (11)$$
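Both statements, the gauge invariance of the field strengths and the Bianchi identities, reduce to algebra for a single plane wave h_ik(x) = H_ik exp(i k·x), where each derivative becomes a factor i k. The Python sketch below is an added illustration under stated assumptions: it takes the linearized Ricci tensor with the sign convention R_ik = −½(Δh_ik + ∂_i∂_k h − ∂_i∂_l h_kl − ∂_k∂_l h_il), uses a symmetric real amplitude H, and drops the common factor i in the gauge shift. The function names are hypothetical.

```python
def ricci_amplitude(H, k):
    """Fourier amplitude of the linearized Ricci tensor for h_ik = H_ik exp(i k.x).

    H must be a symmetric 3x3 nested list; k is a 3-component wave vector.
    """
    n = range(3)
    k2 = sum(k[i] ** 2 for i in n)
    trace = sum(H[i][i] for i in n)
    kH = [sum(k[l] * H[l][i] for l in n) for i in n]  # contraction k_l H_li
    return [[0.5 * (k2 * H[i][j] + k[i] * k[j] * trace
                    - k[i] * kH[j] - k[j] * kH[i])
             for j in n] for i in n]


def gauge_shift(H, k, F):
    """H_ik -> H_ik + k_i F_k + k_k F_i (common factor i dropped)."""
    return [[H[i][j] + k[i] * F[j] + k[j] * F[i] for j in range(3)]
            for i in range(3)]


def bianchi_residual(H, k):
    """k_i (R_ik - (1/2) delta_ik R) for each free index k."""
    R = ricci_amplitude(H, k)
    scalar = sum(R[i][i] for i in range(3))
    return [sum(k[i] * R[i][j] for i in range(3)) - 0.5 * k[j] * scalar
            for j in range(3)]


if __name__ == "__main__":
    H = [[0.3, -0.1, 0.2], [-0.1, 0.5, 0.4], [0.2, 0.4, -0.7]]
    k = [1.0, -2.0, 0.5]
    F = [0.7, 0.2, -1.1]
    print(ricci_amplitude(H, k))
    print(ricci_amplitude(gauge_shift(H, k, F), k))
    print(bianchi_residual(H, k))
```

The two printed Ricci amplitudes agree to rounding error and the Bianchi residual vanishes to the same accuracy, for any choice of H, k, and F.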


3 BOUNDARY CONDITION FOR WEAK-FIELD MODEL 


We are now in a position to formulate the boundary condition³ to be
imposed on physically acceptable solutions of the dynamical equation (9). 
According to the statement made in the last section we can view the state 


2 The situation is quite different in the case of a closed topology, for example, for 
geometries that are close to a flat 3-torus. I am indebted to Prof. B. S. DeWitt for useful 
comments on this point. See also [1]. 

3 For a discussion of the analogous problem in the framework of the 2 + 1 dimensional
model for the quantum theory of gravity see [3].




vector $\psi$ as a functional of the field strengths, $\psi[h] = \Psi[R^{(1)}]$. To be physically
acceptable this functional should associate zero amplitude with singularly
curved geometries characterized by $R^{(1)}_{ik} \to \infty$. In our present approximation,
infinite field strength corresponds to a geometry that oscillates infinitely
fast around an almost flat space, and it seems to be reasonable to require
that the amplitude for the occurrence of such geometries be zero. We
therefore impose the boundary condition

$$\psi[h] \to 0 \qquad (R^{(1)}_{ik} \to \infty) \qquad (12)$$

This boundary condition guarantees the existence of a formal Fourier
transform of $\Psi[R^{(1)}]$ with respect to $R^{(1)}_{ik}$:

$$\psi[h] = \int Dk\,\exp\Bigl\{i\int dx\,k^{ik}(x)\,R^{(1)}_{ik}(x)\Bigr\}\,\varphi[k] \qquad (13)$$


This representation is of course redundant since the field strengths are not
independent; a large family of different functionals $\varphi[k]$ leads to the same
functional $\Psi[R^{(1)}]$. To get rid of this redundancy we introduce the new
variable of integration $p$ by

$$k^{ik} = p^{ik} - \tfrac{1}{2}\,\delta^{ik}\,p_l{}^l$$

The representation (13) then takes the form

$$\psi[h] = \int Dp\,\exp\Bigl\{i\int dx\,p^{ik}(x)\bigl[R^{(1)}_{ik}(x) - \tfrac{1}{2}\,\delta_{ik}\,R^{(1)}(x)\bigr]\Bigr\}\,\varphi[p] \qquad (14)$$


In this form the Bianchi identities are easily taken into account by decomposing
the field $p^{ik}(x)$ into scalar, longitudinal, and transverse traceless parts

$$p^{ik}(x) = p_s(x)\,\delta^{ik} + \partial^i p^k(x) + \partial^k p^i(x) + p_t^{ik}(x) \qquad (15)$$

where the transverse traceless part $p_t$ satisfies

$$\partial_i\,p_t^{ik} = p_{t\,i}{}^{i} = 0$$

By virtue of the Bianchi identities (11) the exponent in (14) is independent of
the longitudinal part $p^i(x)$. We can thus carry out the integration over the
longitudinal part and thereby eliminate the redundancy in the representation
(14). The result reads

$$\psi[h] = \int Dp_s\,Dp_t\,\exp\Bigl\{i\int dx\,\bigl[p_t^{ik}\,R^{(1)}_{ik} - \tfrac{1}{2}\,p_s\,R^{(1)}\bigr]\Bigr\}\,\varphi[p_s, p_t] \qquad (16)$$

This representation incorporates both the fact that $\psi$ is independent of the
coordinate frame and the boundary condition (12).




4 EXPLICIT SOLUTION OF THE WEAK-FIELD MODEL 


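The dynamical equation (9) of the weak-field model implies the following
simple functional differential equation for the Fourier transform

$$-i\,\frac{\delta\varphi}{\delta p_s(x)} + J(x)\,\varphi = 0 \qquad (17)$$

where $J$ is a quadratic expression in the momenta $p_s$, $p_t$:

$$J = -\frac{l_0^4}{16}\bigl[(\Delta p_s)^2 - \partial_i\partial_k p_s\,\partial^i\partial^k p_s - 2\,\partial_i\partial_k p_s\,\Delta p_t^{ik} - \Delta p_t^{ik}\,\Delta p_{t\,ik}\bigr]$$

The differential equation (17) fixes the dependence of the functional $\varphi[p_s, p_t]$
on the variable $p_s$ uniquely. Explicitly the solution reads

$$\varphi[p_s, p_t] = \Phi[p_t]\,\exp\{-iQ[p_s, p_t]\} \qquad (18)$$

$$Q[p_s, p_t] = -\frac{l_0^4}{16}\int dx\,\bigl\{(\partial^i\partial^k p_s + \Delta p_t^{ik})\,\partial_i p_s\,\partial_k p_s - p_s\,\Delta p_t^{ik}\,\Delta p_{t\,ik}\bigr\} \qquad (19)$$

The quantity $\Phi$ is an arbitrary functional of the transverse traceless part $p_t$.
To every such functional there corresponds a solution of the weak-field model

$$\psi[h] = \int Dp_s\,Dp_t\,\exp\Bigl\{i\int dx\,\bigl[p_t^{ik}\,R^{(1)}_{ik} - \tfrac{1}{2}\,p_s\,R^{(1)}\bigr]\Bigr\}\,\exp\{-iQ[p_s, p_t]\}\,\Phi[p_t] \qquad (20)$$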


5 SUPERSPACE 


It is instructive to reinterpret the solutions just obtained in the space of 
all 3-geometries, Wheeler’s superspace [2]. In the weak-field model this is 
the space of all equivalence classes of fields $h_{ik}(x)$, two such fields being
equivalent if they are related through a gauge transformation of the type (10).
In terms of the decomposition

$$h_{ik} = h_s\,\delta_{ik} + \partial_i h_k + \partial_k h_i + h_{t\,ik} \qquad (21)$$

analogous to (15) a point in superspace corresponds to a given pair of fields
$\{h_s(x), h_t(x)\}$. The fact that physical state functionals are independent of
the coordinate frame implies that these functionals are independent of the
longitudinal parts of $h_{ik}$, that is, they are functionals on superspace, $\psi =
\psi[h_s, h_t]$. The representation (20) of the physical state functionals then
implies that the knowledge of $\psi[h_s, h_t]$ at one particular superhypersurface
$h_s = h_s^0$ determines the functional completely:

$$\psi[h_s, h_t] = \int G(h_s, h_t;\,h_s^0, h_t')\,Dh_t'\;\psi[h_s^0, h_t'] \qquad (22)$$




The kernel G is expressible in terms of formal Fourier integrals. This 
representation of the solutions manifests the impact of the boundary con- 
dition (12) most clearly. The original dynamical equation in superspace, 
Eq. (9), is second order in the derivatives, and a solution of this equation is in
general determined only if both the value of $\psi$ and its normal derivative
$\delta\psi/\delta h_s$ are given at the superhypersurface $h_s = h_s^0$. The boundary
condition (12), however, reduces the manifold of solutions drastically, such that
the value of the normal derivative cannot be specified independently, but is 
determined by the value of the functional itself. 


6 STATIONARY PHASE APPROXIMATION 


As a consistency check let us finally verify that the stationary phase 
approximation to the solution (20) satisfies the Hamilton-Jacobi equation. 
Assume that the functional ® is of the form 


®Lp,] = ®oLp.] exp idl pi] 


where the real functional $\Phi_0[p_t]$ varies smoothly with $p_t$. The Fourier
integral defining $\psi[h]$ may then be expected to pick up significant contributions
only from a neighborhood of the point of stationary phase which
occurs, say, at $p_s$, $p_t$. In the stationary phase approximation the functional
$\psi[h]$ then takes the form*

$$\psi[h] = A\,\exp iS \tag{23}$$


where A is some slowly varying functional and S is the value of the phase of 
the integrand at the stationary point 


S= | dx{pERW — 4p, R} — O[p,, P11 + OLPi) 


and is a functional of the field $h_{ik}(x)$. We are interested in the functional
derivative of $S$ with respect to $h_{ik}(x)$. To compute this quantity we observe
that a small change in the field $h_{ik}(x)$ slightly shifts the stationary values
$p_s$, $p_t$. This does not produce a change in the value of $S$, however, since $S$
is stationary with respect to variations of $p_s$ and $p_t$. The variation of $S$
induced by the given variation of the field $h_{ik}(x)$ is therefore given by


5S = | dx{ptdsRy? — $p,5R™} 
and from this one reads off the value of the functional derivative 


6S | | 
——— = —L{Ap* + dip. — di*Ap 24 
5h,(x) S{Apy + OD, Ps} (24) 


* The classical action is obtained from $S$ by dividing through $\hbar = h/2\pi$, where $h$ is
Planck’s constant.



Let us now determine the stationary point in the integration over the variable
$p_s$. The condition reads


6Q 


2 
6p,(x) 


+ R(x) =0 (25) 


The quantity $\delta\Omega/\delta p_s$ is a quadratic expression in the variables $p_s$, $p_t$ and it is
straightforward to verify that it may be expressed in terms of the particular 
linear combination occurring in (24) with the result [6] 

6S 6S 


L606, = 2050). -— — — + RY = 
$1y (Oi O1m il ish 5h, se 0 (26) 


This is indeed the Hamilton-Jacobi equation of the weak-field model.


ACKNOWLEDGMENTS 


It is a pleasure to acknowledge many stimulating remarks and critical 
comments by Professors B. DeWitt, A. Lichnerowicz, and J. A. Wheeler. 


REFERENCES 


1. B. S. DeWitt, Quantum Theory of Gravity, to be published in Phys. Rev. 

2. J. A. Wheeler, Relativity, Groups, and Topology, 1963, Les Houches Lectures
(Gordon and Breach, New York, 1964); see also “Superspace and the Nature of
Quantum Geometrodynamics,” in these Proceedings.

3. H. Leutwyler, Nuovo Cim. 42, 159 (1966).


XI


The Everett-Wheeler 
Interpretation of 
Quantum Mechanics 


BRYCE S. DeWITT 


Every physical theory consists of two parts: (1) a mathematical formalism
and (2) a metaphysical¹ framework which ascribes physical meaning to
the theoretical symbols. In the case of the quantum theory part (2) plays an
exceptionally large role because the theory, in spite of its enormous practical
success, is so contrary to intuition and so strange, even after forty years, that
the experts themselves do not all agree today what to make of it. 

Most physicists more or less follow Bohr in maintaining that the 
formalism of quantum mechanics can make sense only if there exists a 
classical (and hence more familiar) realm in which the results of observations 
can be uniquely recorded and communicated. However, there is disagreement
as to where the dividing line between “quantum” and “classical” is
located and how the transition from one realm to the other should be 
described. Some, the empiricists or pragmatists, insist that it doesn’t matter, 
that such questions arise only when one attempts to ascribe reality to the 
wave function, and that they automatically resolve themselves if one fully 
adopts the philosophy of Complementarity. Others, the realists, are more 
willing to face up to the literal consequences of the quantum formalism but 
try to compromise them at the classical level with the aid of statistical
ensembles. A small minority of realists insists that the quantum theory is
only a partial theory, the statistical aspects of which will someday follow 
from a more complete theory based on presently hidden variables. 


¹ The word “metaphysics” is here to be understood as bearing the same relation to
“physics” as “metamathematics” bears to “mathematics.”



Ten years ago Everett and Wheeler [1,2] proposed an entirely new 
interpretation of quantum mechanics, which accepts the literal consequences 
of the mathematical formalism without any compromises whatever. Because 
of the rather bizarre world view which this acceptance entails, the Everett- 
Wheeler proposal has thus far found little favor. It has, however, the merit 
of bringing clearly into the foreground most of the fundamental issues 
arising in quantum measurement theory, and therefore I can use as an excuse 
for describing it to you today the fact that it will be good for your general 
education! 

I have also another excuse—a poor one perhaps, but one nevertheless 
worthy of consideration. Among the lectures to which we have listened 
in these Rencontres have been several by Penrose which called attention to 
the inadequacy of classical general relativity for dealing with the phenomenon 
of gravitational collapse. It has frequently been suggested that this in- 
adequacy might be overcome by building a theory which takes quantum 
effects into account. Professor Wheeler has outlined to us some of the formal 
properties necessarily displayed by such a theory, and has given us many 
additional beautiful reasons for believing in it, having to do with the topology 
of space-time as a potential source of physical structure. According to 
this theory, space-time, in its role as a pseudo-Riemannian manifold, is not 
only the arena in which physical events take place but also, in the guise of 
the gravitational field, a dynamical Hamiltonian system subject to the laws 
of quantum mechanics. Since space-time serves as the common arena for 
both classical and quantum processes it is evident that one faces unusual 
difficulties in this theory in attempting to define Bohr’s classical realm. The 
difficulties are particularly acute in a cosmology which assumes that space 
is compact. Wheeler has shown us that in this case the state of the world is
very naturally represented as a function over 3-geometries. Although the 
structure of its domain manifold (that is, the superspace of 3-geometries) is 
as yet poorly understood there can be no doubt that this function is quite 
literally a wave function for the universe. 

To anyone educated in the Copenhagen school such a function is 
meaningless, since it leaves no room for a classical realm. The external 
observer has no place to stand. In order to give meaning to a universal 
wave function one must find a radically new way of looking at quantum 
mechanics. This is what Everett and Wheeler have done. 

Since the quantum theory of space-time is still under construction I 
cannot actually use it as a background for this talk. Fortunately I don’t 
have to. The Everett-Wheeler metatheory applies equally well to ordinary 
nonrelativistic quantum mechanics, and this is the setting in which I shall 
discuss it. In fact, this is the setting in which Everett and Wheeler themselves 
discussed it and is the only setting in which it has thus far been applied. It 
could be extended in a fairly straightforward manner to quantum field
theories satisfying the requirements of special relativity, provided one does
not insist on too much rigor at points where the divergence difficulties of these
theories enter.² However, apart from the possibility that the infinite
number of degrees of freedom possessed by relativistic fields may expedite 
the randomization of phases which occurs during measurement processes, 
such an extension would not be expected to yield any unusual new insights. 
The famous investigations of Bohr and Rosenfeld [3] showed long ago that 
the measurements one can in principle perform on quantized fields are 
analogous in every respect to the measurements one can perform on non- 
relativistic systems. In both cases accuracy is limited only by the un- 
certainty principle. 

Let us proceed then to the main topic. According to Everett and 
Wheeler the real world, or any isolated part of it we may wish for the moment 
to regard as “the world,” is faithfully mirrored (in the nonrelativistic
approximation) by the following collection of mathematical objects: (1) a
Hilbert space, (2) a Hamiltonian operator, and (3) a Schrödinger equation
for vectors in the Hilbert space. Nothing else is needed. In particular, the 
existence of Bohr’s external classical world is denied. 

As to why the good Lord chose to construct the world along these 
particular mathematical lines, Everett and Wheeler are silent. The economy 
of these lines, however, is remarkable. It turns out that only one additional 
postulate is necessary to give the mathematics physical meaning. This is 
the postulate of complexity: The Hamiltonian must be sufficiently complicated 
so that it may be regarded as describing at least two (but generally many 
more) distinct dynamical systems which, under certain conditions, but not 
always, can act as if they are isolated and independent. When these conditions
hold it is convenient to decompose the Hilbert space into a corresponding
Cartesian product.³

One may ask whether a postulate is not also needed which, in effect,
says that the “states” of the world are represented by rays in the Hilbert
space. We shall indeed use the word “state” in just this sense. However,
a postulate is not needed for this purpose. In a carefully constructed
syntax words like “state,” “dynamical system,” “isolated,” “independent,”
which we are here using in an informal way for the sake of convenience and
intelligibility, would be replaced by symbols subject, together with the
mathematical symbols of the formalism itself, to certain formal rules of
manipulation but empty of any a priori meaning. These words acquire
semantic content only after one has investigated the consequences of the
genuine postulates, viz., the statement of the mathematical content of the
formalism and the postulate of complexity. It is, in fact, the a posteriori
semantic content of the formalism that Everett and Wheeler wish to emphasize.
Without drawing on any external metaphysics or mathematics
other than the standard rules of logic they are able to prove the following
metatheorem: The mathematical formalism of the quantum theory is capable
of yielding its own interpretation.


² It has been conjectured that some of these difficulties might be alleviated in a
generally covariant theory, with the quantized gravitational field acting as a divergence
regulator. But this is a question for the future to answer.

³ In relativistic quantum field theories one encounters infinite Cartesian products,
reflecting the infinitude of the numbers of degrees of freedom. This means that the relevant
vector spaces are not strict Hilbert spaces but appropriate generalizations thereof. Because
of the divergence difficulties of these theories the definition of “appropriate” has not yet
been fully settled.

The proof of this metatheorem is carried out in stages, with the world
assuming an increasingly complex character at each stage. One begins by
considering a “world” consisting of just two “dynamical systems.” That
is, one assumes the Hamiltonian to have a certain structure which, after the
semantic content of the formalism has been established, will enable it to be
recognized as capable of describing a pair of actual physical systems. One
of these systems is conventionally called simply “the system,” and the other
is called “the apparatus.” In discussing the “states” of these systems I
find it convenient to employ the notation of Dirac.⁴ Thus I will denote a
member of an orthonormal set of basis vectors for the system by $|s\rangle$ and a
member of an orthonormal set for the apparatus by $|A\rangle$. Here $s$ is an
eigenvalue of some system observable and $A$ is an eigenvalue of some apparatus
observable. (Additional labels which might, in a practical situation, be
needed in order to specify the basis vectors uniquely are suppressed.) The
Cartesian product structure of the Hilbert space of “the world” may be
expressed by introducing for it the basis vectors


$$|s, A\rangle = |s\rangle|A\rangle \tag{1}$$

In order to keep the subsequent analysis as simple as possible I shall assume
that the eigenvalue $s$ ranges over a discrete set while the eigenvalue $A$ ranges
over a continuum. Orthonormality and completeness are then expressed by

$$\langle s, A|s', A'\rangle = \delta_{ss'}\,\delta(A - A') \tag{2}$$

$$\sum_s \int |s, A\rangle\langle s, A|\,dA = 1 \tag{3}$$


Suppose that the state of “the world” at some initial instant is represented
by a vector of the form

$$|\Psi_0\rangle = |\psi\rangle|\Phi\rangle \tag{4}$$

where $|\psi\rangle$ is a vector in the system Hilbert space and $|\Phi\rangle$ is a vector in the
apparatus Hilbert space. In such a state the system and apparatus are said
to be uncorrelated. Let us assume that the role of the apparatus is to “learn”
something about the system state. System and apparatus must then be
coupled together during a certain period, so that the state of the world will
not retain the form (4) as time passes. The final result of the coupling will
be described by the action of a certain unitary operator $U$:

$$|\Psi_1\rangle = U|\Psi_0\rangle \tag{5}$$


⁴ Reference may be made to [4] for the formal definition (in terms of Hermitian
operators) of the noun “observable” which is used in what follows.


Here it is convenient to work in the so-called interaction picture in which the
state vector changes with time only during the coupling interval. The
operator $U$ is then the solution of a Schrödinger equation in which only
that part of the Hamiltonian which refers to the coupling appears. Just one
word of caution in regard to terminology is in order. In classical mechanics
the word “state” usually refers to “position and velocity at a particular
instant,” but in the interaction picture of quantum mechanics it may refer
to one of the vectors $|\Psi_0\rangle$ and $|\Psi_1\rangle$ which together describe (in a quantum
sense) the entire space-time trajectory outside of the coupling interval. It
should perhaps also be emphasized that the existence of a coupling interval
need not imply a time dependence on the part of the Hamiltonian.⁵ The
coupling interval may be incorporated into the state vector $|\Psi_0\rangle$ instead.
For example, let $|X, t\rangle$ represent the state in which the apparatus has the
position $X$ at the time $t$. Then if the function $\langle X, t|\Phi\rangle$ has the form of a
wave packet, the coupling interval may be the period during which the wave
packet is in a certain “region of interaction.” Thus, in a scattering experiment,
where the apparatus becomes a projectile, the system becomes a target,
and $U$ becomes the so-called scattering operator or S-matrix, the region of
interaction is the neighborhood of the target.

If there were no coupling between system and apparatus the state
vector of “the world” would remain permanently equal to $|\Psi_0\rangle$. It is
therefore tempting to define the “measurement” which the apparatus
performs on the system as the difference between $|\Psi_1\rangle$ and $|\Psi_0\rangle$. However,
$|\Psi_1\rangle$ is the state vector of “the world” and not that of the apparatus.
Moreover, system and apparatus have thus far been treated on an equal
footing, so that we could with equal right speak of the “measurement”
which the system performs on the apparatus. Evidently there must be some
separation of function, so that a distinction may be made between the two.
This is done by assuming that their mutual coupling is of such a form as to
yield an operator $U$ which has the following action on the basis vectors (1):


$$U|s, A\rangle = |s, A + gs\rangle = |s\rangle|A + gs\rangle \tag{6}$$

where $g$ is a “coupling constant.” If the initial state of the system were $|s\rangle$
and that of the apparatus were $|A\rangle$ then this coupling would be said to result
in an “observation,” by the apparatus, that the system observable has the
value $s$. Moreover, this “observation” or “measurement” would be regarded
as “stored” in the apparatus “memory” by virtue of the permanent
shift, from $|A\rangle$ to $|A + gs\rangle$, in the apparatus state vector.


⁵ If the Hamiltonian were time-dependent the law of energy conservation would not
hold, and one would be disinclined to regard system and apparatus as forming a truly
isolated and self-contained “world.”
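The action of the coupling (6) on an uncorrelated state can be illustrated by a minimal sketch. It rests on assumptions that are not part of the lecture: the apparatus variable $A$ is restricted to an integer grid, state vectors are stored as Python dictionaries, and the names `measure`, `relative_state`, and `inner` are purely illustrative.

```python
import math

def measure(psi, phi, g):
    """Apply U|s, A> = |s, A + g*s> to the uncorrelated state psi (x) phi.

    psi: dict, system eigenvalue s -> amplitude c_s
    phi: dict, integer pointer position A -> amplitude Phi(A)
    Returns a dict mapping (s, A) -> amplitude of the final state.
    """
    final = {}
    for s, c_s in psi.items():
        for a, phi_a in phi.items():
            key = (s, a + g * s)          # pointer shifted by g*s
            final[key] = final.get(key, 0) + c_s * phi_a
    return final

def relative_state(final, s):
    """Unnormalized apparatus state found next to |s> in the final state."""
    return {a: amp for (s2, a), amp in final.items() if s2 == s}

def inner(u, v):
    """Inner product <u|v> of two apparatus states on the grid."""
    return sum(u[a].conjugate() * v[a] for a in u if a in v)

# Hypothetical example data: a two-valued system and a three-point packet.
psi = {0: 0.6, 1: 0.8}
phi = {-1: 0.5, 0: math.sqrt(0.5), 1: 0.5}

final = measure(psi, phi, g=3)
r0, r1 = relative_state(final, 0), relative_state(final, 1)
print(inner(r0, r0), inner(r1, r1))   # branch weights, close to 0.36 and 0.64
print(inner(r0, r1))                  # vanishes: the shifted packets are disjoint
```

With $g = 3$ the two shifted packets share no grid point, so the apparatus states attached to $s = 0$ and $s = 1$ are orthogonal; with $g = 1$ the packets would overlap and the record would be ambiguous.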

It would lead us too far from our main topic to describe the Hamiltonian 
structures and couplings which, to good approximation at least, are capable 
of yielding measurements of the form (6). Suffice it to say that such struc- 
tures are not difficult to devise [5] and that, in principle, any system observable 
is capable of being measured by a suitable coupling. Whether such couplings
can always (or even sometimes) be realized in the laboratory is irrelevant, 
being a question which concerns the practical limitations of the experimenter’s 
art and not an issue of principle. There is, however, one limitation which 
must be mentioned because it has a somewhat more fundamental character 
than the rest. In most cases the measurement will not be a practical success 
unless the apparatus is “macroscopic,” which means that the time integral,
over the coupling period, of its mean energy in the initial state must be large 
compared to the time integral of the coupling energy, and hence large com- 
pared to Planck’s unit of action. This is particularly true when the system 
itself contains only a few Planck units and is hence capable of being non- 
negligibly disturbed by the measurement process. Under these circumstances 
one frequently calls the system a “quantum system” and the apparatus a
“classical system.” It deserves emphasis, however, that this is only a
manner of speaking. The apparatus is still in a “pure” state (that is,
represented by a single ray in its Hilbert space), and no matter how compli- 
cated this state may be and how incomplete our knowledge of it, it does not 
represent a statistical ensemble. 

In a measurement of the form (6) the value of the system observable 
appears to remain undisturbed by the coupling. While measurements of 
this type do exist (for example, the Stern-Gerlach experiment in first approximation),
it is much more frequently the case that the observable suffers a
change. Bohr and Rosenfeld [3] have nevertheless shown that if suitable 
compensation devices are introduced, which may be represented by certain 
second-order corrections to the coupling (see DeWitt, [5]), the apparatus is 
able to record what the value of the system observable would have been had 
there been no coupling. Given a knowledge of the explicit form of the 
coupling it is always possible to express this hypothetical value in terms of 
the actual dynamical variables. Hence the observable of which this value 
is an eigenvalue is not itself hypothetical, and no inconsistency will arise if 
we take it to be the observable to which the label s refers on the right-hand 
side of Eq. (6). We must remember, however, that its expression in terms 
of conventional dynamical variables may be very complicated, so that it
would generally be very difficult to find a coupling which would, in effect,
reconstruct it for us and allow us to measure it a second time. Moreover, 
there is another difficulty which affects both the apparatus and system 
observables equally. In the interaction picture it is incorrect to refer to 
“position” or “momentum” as observables. Rather one must speak of the
“position at a given time” or the “momentum at a given time.” Similarly,
the quantities s and A, which label the system and apparatus basis vectors, 
must be understood as involving particular instants or intervals of time in 
their intrinsic definition. Unless the observables, of which these quantities 
are the eigenvalues, commute with the Hamiltonian, neither the state |s> nor 
the state |A> can be referred to as stationary, either before or after the coupling 
interval, in spite of the fact that the vectors themselves are time-independent. 
This means that even if the coupling leaves the system observable undisturbed 
it will not be easy to repeat the observation at a later time. Moreover, it 
will become increasingly difficult, as time passes, to “read out” the contents
of the apparatus memory. 

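Bearing all these cautionary remarks in mind we now examine the outcome
of the measurement process when the initial state of “the world” has
the general form (4). Using (3), (5), and (6), we easily find

$$|\Psi_1\rangle = \sum_s c_s|s\rangle|\Phi[s]\rangle \tag{7}$$

$$c_s = \langle s|\psi\rangle \tag{8}$$

$$|\Phi[s]\rangle = \int |A + gs\rangle\,\Phi(A)\,dA \tag{9}$$

$$\Phi(A) = \langle A|\Phi\rangle \tag{10}$$

It is convenient to adopt the normalizations

$$\langle\psi|\psi\rangle = 1, \qquad \langle\Phi|\Phi\rangle = 1 \tag{11}$$

which yield

$$\langle\Psi_0|\Psi_0\rangle = \langle\Psi_1|\Psi_1\rangle = 1 \tag{12}$$

and

$$\sum_s w_s = 1, \qquad w_s = |c_s|^2 \tag{13}$$

$$\int |\Phi(A)|^2\,dA = 1 \tag{14}$$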


The final state vector (7) does not represent the system observable as
having any unique value. Rather it is a linear superposition of vectors
$|s\rangle|\Phi[s]\rangle$, each of which represents the system observable as having assumed
one of its possible values and the apparatus as having observed that value.
Everett and Wheeler have coined the expression “relative state” to denote
the state of the apparatus represented by the vector $|\Phi[s]\rangle$. Relative to each
possible state $|s\rangle$ of the system there is a corresponding apparatus state
$|\Phi[s]\rangle$. For each possibility the observation will be a good one, that is,
capable of distinguishing between adjacent values of $s$, provided

$$\Delta A \ll g\,\Delta s \tag{15}$$

where $\Delta s$ is the spacing between adjacent values and

$$\Delta A^2 = \int (A - \langle A\rangle)^2\,|\Phi(A)|^2\,dA, \qquad \langle A\rangle = \int A\,|\Phi(A)|^2\,dA \tag{16}$$

Under these conditions we have

$$\langle\Phi[s]|\Phi[s']\rangle = \delta_{ss'} \tag{17}$$


That is, the wave function of the apparatus takes the form of a packet which 
is initially single but which subsequently breaks up, as a result of the coupling 
to the system, into a multitude of mutually orthogonal packets, one for each 
value of $s$. We note that the packet structure of $\Phi(A)$ suggests again the
utility of choosing an apparatus which is macroscopic. With a macroscopic
apparatus it is possible to reduce to tolerable limits the packet spreading
which inevitably occurs when the states $|A\rangle$ are not stationary, so that some
degree of practical permanence may be imparted to the apparatus memory. 
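How sharply condition (15) enforces the orthogonality (17) can be checked numerically. The sketch below assumes a Gaussian packet, a choice made here for illustration and not taken from the lecture, for which the overlap of two packets displaced by $d = g(s - s')$ is $\exp(-d^2/8\,\Delta A^2)$. The function names and the integration grid are likewise hypothetical.

```python
import math

def packet(a, sigma):
    """Gaussian amplitude Phi(A) whose |Phi|^2 has standard deviation sigma."""
    return (2 * math.pi * sigma ** 2) ** -0.25 * math.exp(-a ** 2 / (4 * sigma ** 2))

def overlap(shift, sigma, half_width=40.0, steps=20000):
    """Riemann-sum estimate of the overlap of two packets displaced by `shift`.

    `shift` plays the role of g*(s - s'); `sigma` plays the role of Delta A.
    """
    da = 2 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        a = -half_width + k * da
        total += packet(a, sigma) * packet(a - shift, sigma)
    return total * da

sigma = 1.0
for ratio in (0.5, 1.0, 2.0, 5.0, 10.0):      # ratio = g * Delta s / Delta A
    numeric = overlap(ratio * sigma, sigma)
    closed_form = math.exp(-ratio ** 2 / 8)   # Gaussian result quoted above
    print(ratio, numeric, closed_form)
```

The overlap falls off rapidly once $g\,\Delta s$ exceeds a few times $\Delta A$, which is the sense in which (17) holds "under these conditions" rather than exactly.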

We have now reached the point at which Everett and Wheeler part 
company with the rest of the physics community. According to the conventional
interpretation of quantum mechanics the wave function “collapses”
immediately after the measurement. Instead of consisting of a multitude
of packets it reduces to a single packet, and the state vector $|\Psi_1\rangle$ reduces to
a corresponding element $|s\rangle|\Phi[s]\rangle$ of the superposition (7). To which
element of the superposition it reduces one cannot say. One instead assigns 
a probability distribution to the possible outcomes, with weights given by 
Eq. (13). 

The collapse of the wave function and the assignment of statistical 
weights do not follow from the Schrödinger equation. They are consequences
of an external a priori metaphysics which is allowed to intervene at this point
and suspend the Schrödinger equation, or rather replace the boundary
conditions on its solution by those of the collapsed wave function. The
destruction of continuity and hence of the causal structure of quantum
mechanics which this entails is repugnant to many physicists, not least to
Everett and Wheeler. But in attempting to save the situation the latter 
differ from all others in the boldness of their solution. In brief, they throw 
the a priori metaphysics out the window and assert that the wave function 
never collapses. 

In defense of this stand it may be argued that the conventional inter- 
pretation of quantum mechanics confuses two concepts which ought really 
to be kept distinct, namely, probability as it relates to quantum mechanics
and probability as it is understood in statistical mechanics. Quantum
mechanics is a theory which attempts to describe in mathematical language 
a world in which chance is not a measure of our ignorance but is absolute. 
It is inevitable that it should lead to states which, like (7), undergo multiple 
fission, corresponding to the many possible outcomes of a given measure- 
ment. Such behavior is built into the formalism. However, precisely 
because quantum mechanical chance is not a measure of our ignorance, we
ought not to tamper with the wave function merely because we acquire new 
information as a result of a measurement. 

The obstacle to taking such a lofty view of things, of course, is that it 
forces us to believe in the “reality” of all the simultaneous “worlds”
represented in the superposition (7), in each of which the measurement has
yielded a different outcome. Nevertheless, this is precisely what Everett
and Wheeler would have us believe. According to them the “real universe”
is faithfully represented by (one might even say “isomorphic to”) a state
vector similar to (7) but of vastly greater complexity. This universe is 
constantly splitting into a stupendous number of branches, all resulting from 
the measurementlike interactions between its myriads of components. 

To those of us who immediately object that we do not feel ourselves 
split, Everett and Wheeler have a ready reply: To the extent that we can be 
regarded simply as automata, and hence on a par with ordinary measuring 
apparatuses, the laws of quantum mechanics do not allow us to feel the split. 
The demonstrability of this reply, which in fact constitutes one of the steps 
in the proof of the Everett-Wheeler metatheorem that we are still in the 
midst of, reveals at once both the strength and the weakness of the Everett- 
Wheeler philosophy. Although it is a beautifully self-consistent philosophy 
it can never receive operational support in the laboratory. There is no
experiment which can reveal the existence of the “other worlds” in a superposition
like (7).

To prove this we begin by noting that although the operator of which 
the apparatus variable $A$ is an eigenvalue has been called an “observable,”
it has so far not actually been observed. What happens if we now introduce 
a second apparatus which not only looks at the memory bank of the first 
apparatus but also carries out an independent direct check on the value of the 
system observable? If the splitting of the wave function is to be unobservable 
the results had better agree. 

For the system we again introduce the basis vectors $|s\rangle$, and for the
apparatuses we introduce basis vectors $|A_1\rangle$ and $|A_2, B_2\rangle$, respectively. The
variable $B_2$ of the second apparatus is to be used to measure $A_1$. The total
measurement will be carried out in two steps. In the first, both apparatuses
observe the system variable $s$, by means of a unitary transition of the form:

$$U_1(|s\rangle|A_1\rangle|A_2, B_2\rangle) = |s\rangle|A_1 + g_1 s\rangle|A_2 + g_2 s, B_2\rangle \tag{18}$$


In the second step apparatus 2 reads the memory contents of apparatus 1.
The unitary transition which accomplishes this is

$$U_2(|s\rangle|A_1\rangle|A_2, B_2\rangle) = |s\rangle|A_1\rangle|A_2, B_2 + g_{12}A_1\rangle \tag{19}$$

Suppose now the initial state of “the world” has the form

$$|\Psi_0\rangle = |\psi\rangle|\Phi_1\rangle|\Phi_2\rangle \tag{20}$$

Then at the end of the first measurement it takes the form

$$|\Psi_1\rangle = U_1|\Psi_0\rangle = \sum_s c_s|s\rangle|\Phi_1[s]\rangle|\Phi_2[s]\rangle \tag{21}$$

where

$$|\Phi_1[s]\rangle = \int |A_1 + g_1 s\rangle\,\Phi_1(A_1)\,dA_1, \qquad \Phi_1(A_1) = \langle A_1|\Phi_1\rangle \tag{22}$$

$$|\Phi_2[s]\rangle = \int dA_2 \int dB_2\,|A_2 + g_2 s, B_2\rangle\,\Phi_2(A_2, B_2), \qquad \Phi_2(A_2, B_2) = \langle A_2, B_2|\Phi_2\rangle \tag{23}$$


and where standard normalization has been assumed. At the end of the 
second measurement it becomes 


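$$|\Psi_2\rangle = U_2|\Psi_1\rangle = \sum_s \int c_s|s\rangle|A_1 + g_1 s\rangle\,\Phi_1(A_1)\,|\Phi_2[s, A_1 + g_1 s]\rangle\,dA_1 \tag{24}$$

where

$$|\Phi_2[s, A_1]\rangle = \int dA_2 \int dB_2\,|A_2 + g_2 s, B_2 + g_{12}A_1\rangle\,\Phi_2(A_2, B_2) \tag{25}$$

In order that the first measurement be good we must have

$$\Delta A_1 \ll g_1\,\Delta s \quad \text{and} \quad \Delta A_2 \ll g_2\,\Delta s \tag{26}$$

where $\Delta A_1$ and $\Delta A_2$ are defined in an obvious manner. If, in addition, we
require

$$\Delta A_1 \ll \Delta B_2/g_{12} \quad \text{and} \quad \Delta B_2 \ll g_{12}\,g_1\,\Delta s \tag{27}$$

then we may, to good approximation, write

$$\int dA_1 \int dA_2 \int dB_2\,|\Phi_1(A_1)|^2\,\Phi_2^*(A_2, B_2)\,\Phi_2\bigl(A_2, B_2 + g_{12}(A_1 - \langle A_1\rangle)\bigr) \approx \int dA_2 \int dB_2\,|\Phi_2(A_2, B_2)|^2 = 1 \tag{28}$$

from which it follows that

$$\Bigl\|\,|\Psi_2\rangle - \sum_s c_s|s\rangle|\Phi_1[s]\rangle|\Phi_2[s, \langle A_1\rangle + g_1 s]\rangle\Bigr\| \approx 0 \tag{29}$$

and hence

$$|\Psi_2\rangle \approx \sum_s c_s|s\rangle|\Phi_1[s]\rangle|\Phi_2[s, \langle A_1\rangle + g_1 s]\rangle \tag{30}$$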



The final state vector is again revealed as a linear superposition of 
vectors, each of which represents the system observable as having assumed 
one of its possible values. Although the value varies from one element of 
the superposition to another, not only do both apparatuses within a given 
element observe the value appropriate to that element, but also, by straight- 
forward communication, they agree that the results of their observations are 
identical. In the above example apparatus 2 may be assumed to have
“known in advance” that the “mean value” of $A_1$ was $\langle A_1\rangle$ in the initial
state of apparatus 1. When, after the second measurement, apparatus 2
“sees” that this mean value has shifted to $\langle A_1\rangle + g_1 s$, it then “knows” that
apparatus 1 has obtained the same value for the system observable, namely,
$s$, as it did.
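The agreement between the two apparatuses can be traced in a toy model. The following sketch assumes infinitely sharp initial packets (so that conditions (26) and (27) hold trivially), integer pointer values, and hypothetical function names; it simply applies the transitions (18) and (19) to every branch of the superposition.

```python
def step_one(state, g1, g2):
    """U_1 of Eq. (18): both apparatuses couple to the system value s.

    `state` maps basis labels (s, A1, A2, B2) to amplitudes.
    """
    out = {}
    for (s, a1, a2, b2), amp in state.items():
        key = (s, a1 + g1 * s, a2 + g2 * s, b2)
        out[key] = out.get(key, 0) + amp
    return out

def step_two(state, g12):
    """U_2 of Eq. (19): apparatus 2 records A1 in its variable B2."""
    out = {}
    for (s, a1, a2, b2), amp in state.items():
        key = (s, a1, a2, b2 + g12 * a1)
        out[key] = out.get(key, 0) + amp
    return out

# Hypothetical example: c_s = 0.6, 0.8 for s = -1, +1; all pointers start at 0.
g1, g2, g12 = 2, 3, 5
initial = {(-1, 0, 0, 0): 0.6, (1, 0, 0, 0): 0.8}
final = step_two(step_one(initial, g1, g2), g12)

for (s, a1, a2, b2), amp in final.items():
    # Within a branch, all three records decode to the same value of s.
    print(s, a1 // g1, a2 // g2, b2 // (g12 * g1), amp)
```

Each branch carries its own value of $s$, and within that branch the memory of apparatus 1, the memory of apparatus 2, and the copy that apparatus 2 made of apparatus 1 all decode to that same value. No branch contains a record belonging to another.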

As another check on the unobservability of measurement-induced 
splittings of the wave function we may consider repeated measurements on 
ensembles of systems. Let $|s_1\rangle, |s_2\rangle, \ldots$ be basis vectors for a sequence of
systems, arbitrary in number, and let the basis vectors of an apparatus which
observes these systems be denoted by $|A_1, A_2, \ldots\rangle$, where each variable $A_n$
represents the state of one cell in the apparatus memory. Let $i_1, i_2, \ldots$ be
a sequence of integers, not necessarily all distinct, and let us assume that
the apparatus undergoes a sequence of couplings with the system ensemble,
of which the $n$th consists of an observation of the $i_n$th system. Since the
integers $i_n$ are not necessarily all distinct it will be convenient to assume that
the $s_i$ are eigenvalues of observables which commute with the total Hamiltonian
and that each measurement is of the easily repeatable nondisturbing
type. The $n$th measurement will then have the following effect on the basis
vectors:

$$U_n(|s_1\rangle|s_2\rangle\cdots|A_1, A_2, \ldots, A_n, \ldots\rangle) = |s_1\rangle|s_2\rangle\cdots|A_1, A_2, \ldots, A_n + g_n s_{i_n}, \ldots\rangle \tag{31}$$

If the initial state of “the world” has the form

$$|\Psi_0\rangle = |\psi_1\rangle|\psi_2\rangle\cdots|\Phi\rangle \tag{32}$$

then, after an arbitrary number of measurements, it becomes

$$|\Psi\rangle = \cdots U_2 U_1|\Psi_0\rangle = \sum_{s_1, s_2, \ldots} c_{s_1}c_{s_2}\cdots|s_1\rangle|s_2\rangle\cdots|\Phi[s_{i_1}, s_{i_2}, \ldots]\rangle \tag{33}$$

where

$$c_{s_i} = \langle s_i|\psi_i\rangle \tag{34}$$

$$|\Phi[s_{i_1}, s_{i_2}, \ldots]\rangle = \int dA_1 \int dA_2 \cdots |A_1 + g_1 s_{i_1}, A_2 + g_2 s_{i_2}, \ldots\rangle\,\Phi(A_1, A_2, \ldots) \tag{35}$$

$$\Phi(A_1, A_2, \ldots) = \langle A_1, A_2, \ldots|\Phi\rangle \tag{36}$$


Once again the final state vector consists of a superposition of “branches.”
The important thing to note about this vector is that although within each 
branch the apparatus may have observed a given system any number of times, 
it always records the same value for the system observable, namely, the 
value appropriate to the branch in question and not that of some other 
branch. The splitting into branches is thus again unobserved. 
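The consistency of repeated observations within a branch can be made concrete with a short enumeration. This is only a sketch under simplifying assumptions: the memory cells start at zero and are perfectly sharp, every coupling constant equals a single hypothetical value `g`, and the function name `branches` is illustrative.

```python
from itertools import product

def branches(psis, order, g=1):
    """Enumerate the branches of the superposition (33).

    psis:  list of dicts, psis[i][s] = amplitude of value s for system i
    order: list of system indices; the n-th coupling observes system order[n]
    Returns a list of (system values, amplitude, memory cells) triples.
    """
    result = []
    for values in product(*(p.keys() for p in psis)):
        amp = 1
        for p, s in zip(psis, values):
            amp *= p[s]                                # product of the c's
        memory = tuple(g * values[i] for i in order)   # cell n holds g * s_{i_n}
        result.append((values, amp, memory))
    return result

# Hypothetical example: two systems, system 0 observed three times.
psis = [{0: 0.6, 1: 0.8}, {0: 0.8, 1: 0.6}]
order = [0, 1, 0, 0]
for values, amp, memory in branches(psis, order):
    print(values, amp, memory)
```

In every printed branch the first, third, and fourth memory cells agree, because they all refer to the same system value of that branch; the values differ only from one branch to another.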

It is not difficult to devise increasingly complicated situations, in which, 
for example, the apparatus can make decisions by switching on various 
couplings depending on the outcome of other observations. It should be 
clear by now, however, that no inconsistencies will ever arise which would 
permit a given apparatus to be aware of more than one “world” at a time. 

We now turn to the final issue which must be settled; namely, the 
question of the coefficients $c_s$ in the superpositions (7), (21), (30), etc.⁶ Thus
far no a priori interpretation has been given to these coefficients. In order 
to find an interpretation let us again use the formalism of the preceding 
example. Only now let us assume that all the systems in the ensemble are 
identical and in identical states. This means 


$$\langle s | \psi_i \rangle = c_s \quad \text{for all } i \tag{37}$$

Furthermore, let us assume that the apparatus observes each system exactly
once, in sequence, which means $i_n = n$ for all $n$. After $N$ measurements the
state vector of “the world” then takes the form

$$|\Psi_N\rangle = \sum_{s_1, s_2, \ldots} c_{s_1} c_{s_2} \cdots |s_1\rangle |s_2\rangle \cdots |\Phi[s_1, s_2, \ldots, s_N]\rangle \tag{38}$$

It will be observed that although every system is initially in exactly the same 
state as every other, the apparatus does not generally record a sequence of 
identical values for the system observable, even within a single element of 
the superposition. Each “memory sequence” $s_1, s_2, \ldots, s_N$ yields a certain
distribution of possible values for the system observable. This distribution
is characterized by the so-called relative frequency function:

$$f(s; s_1 \ldots s_N) = \frac{1}{N} \sum_{n=1}^{N} \delta_{s s_n} \tag{39}$$

We shall need the following easily verified properties of this function:

$$\sum_{s_1 \ldots s_N} f(s; s_1 \ldots s_N)\, w_{s_1} \cdots w_{s_N} = w_s \tag{40}$$

$$\sum_{s_1 \ldots s_N} [f(s; s_1 \ldots s_N) - w_s]^2\, w_{s_1} \cdots w_{s_N} = \frac{1}{N}\, w_s (1 - w_s) \tag{41}$$

where the $w$’s are any numbers which, taken all together, add up to unity.
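The two properties (40) and (41) can be checked by brute force for short sequences. The following is only a minimal sketch: the weights, the sequence length `N`, and the function names are illustrative assumptions, not part of the lecture.

```python
from itertools import product
from math import prod


def relative_frequency(s, seq):
    """f(s; s_1 ... s_N) of Eq. (39): fraction of entries of seq equal to s."""
    return sum(1 for x in seq if x == s) / len(seq)


def frequency_moments(weights, n, s):
    """Weighted mean and mean-square deviation of f(s; .) over all length-n sequences.

    Each sequence s_1 ... s_n carries the weight w_{s_1} ... w_{s_n},
    as on the left-hand sides of Eqs. (40) and (41).
    """
    mean = 0.0
    spread = 0.0
    for seq in product(range(len(weights)), repeat=n):
        weight = prod(weights[x] for x in seq)
        f = relative_frequency(s, seq)
        mean += f * weight
        spread += (f - weights[s]) ** 2 * weight
    return mean, spread


# Illustrative (assumed) weights, adding up to unity, and sequence length.
weights = [0.5, 0.3, 0.2]
N = 6

for s, w in enumerate(weights):
    mean, spread = frequency_moments(weights, N, s)
    # Compare with the right-hand sides w_s and w_s (1 - w_s) / N.
    print(s, mean, w, spread, w * (1 - w) / N)
```

The enumeration grows exponentially with `N`, so this is a check of the identities for small cases and not a practical way to evaluate them.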


⁶ The following discussion is essentially due to R. N. Graham.




Let us choose for the $w$’s the numbers defined in Eq. (13), and let us
introduce the function

$$\delta(s_1 \ldots s_N) = \sum_s [f(s; s_1 \ldots s_N) - w_s]^2 \tag{42}$$

This function measures the degree to which the sequence $s_1 \ldots s_N$ deviates
from a random sequence with weights $w_s$. Let $\varepsilon$ be an arbitrarily small
positive number. We shall call the sequence $s_1 \ldots s_N$ random if $\delta(s_1 \ldots s_N) < \varepsilon$
and nonrandom otherwise. Suppose we remove from the superposition (38)
all those elements for which the apparatus memory sequence is nonrandom.
Denote the result by $|\Psi_N^\varepsilon\rangle$ and define

$$|\chi_N^\varepsilon\rangle = |\Psi_N\rangle - |\Psi_N^\varepsilon\rangle = \sum_{\substack{s_1, s_2, \ldots \\ \delta(s_1 \ldots s_N) \ge \varepsilon}} c_{s_1} c_{s_2} \cdots |s_1\rangle |s_2\rangle \cdots |\Phi[s_1 \ldots s_N]\rangle \tag{43}$$


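Then if each measurement is good, so that

$$\langle \Phi[s_1 \ldots s_N] \,|\, \Phi[s_1' \ldots s_N'] \rangle = \prod_{n=1}^{N} \delta_{s_n s_n'} \tag{44}$$

[cf. Eq. (17)], we have

$$\langle \chi_N^\varepsilon | \Psi_N^\varepsilon \rangle = 0 \tag{45}$$

and also

$$\langle \chi_N^\varepsilon | \chi_N^\varepsilon \rangle = \sum_{\substack{s_1, s_2, \ldots \\ \delta(s_1 \ldots s_N) \ge \varepsilon}} w_{s_1} w_{s_2} \cdots = \sum_{\substack{s_1 \ldots s_N \\ \delta(s_1 \ldots s_N) \ge \varepsilon}} w_{s_1} \cdots w_{s_N} \le \frac{1}{\varepsilon} \sum_{s_1 \ldots s_N} \delta(s_1 \ldots s_N)\, w_{s_1} \cdots w_{s_N} = \frac{1}{N\varepsilon} \sum_s w_s (1 - w_s) < \frac{1}{N\varepsilon} \tag{46}$$

in which use is made of (41) and (42). From this it follows that no matter
how small we choose $\varepsilon$ we can always find an $N$ big enough so that the norm
of $|\chi_N^\varepsilon\rangle$ becomes smaller than any positive number. This means that

$$\lim_{N \to \infty} \bigl(|\Psi_N\rangle - |\Psi_N^\varepsilon\rangle\bigr) = 0 \quad \text{for all } \varepsilon > 0 \tag{47}$$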
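The estimate (46) can be illustrated numerically by summing the weights of all nonrandom memory sequences and comparing the total with the bound $\sum_s w_s(1 - w_s)/N\varepsilon$. This is a hedged sketch only: the two-valued observable, the weights, the value of `eps`, and the function name are assumptions chosen for illustration.

```python
from itertools import product
from math import prod


def nonrandom_weight(weights, n, eps):
    """Total weight w_{s_1} ... w_{s_n} of the sequences with delta >= eps.

    delta is the deviation function of Eq. (42); the returned number plays
    the role of the squared norm on the left-hand side of Eq. (46).
    """
    values = range(len(weights))
    total = 0.0
    for seq in product(values, repeat=n):
        delta = sum((seq.count(s) / n - weights[s]) ** 2 for s in values)
        if delta >= eps:
            total += prod(weights[x] for x in seq)
    return total


# Illustrative (assumed) weights and tolerance.
weights = [0.5, 0.5]
eps = 0.05

for n in (4, 8, 12, 16):
    norm_sq = nonrandom_weight(weights, n, eps)
    bound = sum(w * (1 - w) for w in weights) / (n * eps)
    print(n, norm_sq, bound)
```

For small `n` the bound exceeds unity and says nothing; its content lies in the decrease with growing `n` at fixed `eps`.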
The conventional probability interpretation of quantum mechanics 
thus emerges from the formalism itself. Nonrandom memory sequences in 
the superposition (38) are “of measure zero” in the Hilbert space, in the 
limit $N \to \infty$.⁷ Each automaton (that is, apparatus cum memory sequence)


7 The number of systems with which we work and the number of cells in the apparatus 
memory can, of course, never be infinite. However, all that we require is that the number 
of systems in the above derivation, although finite, be arbitrarily large. A given memory 
sequence will, if sufficiently long, approximate a random sequence with overwhelming 
probability. 

It is worth noting that nowhere in these arguments does the imposition of a normaliza- 
tion condition on the initial state vectors play an essential role. If we had started with 
nonnormalized states we should simply have been forced to renormalize the weights $w_s$ in
order to relate them to the relative frequency function. Once the conventional probability 
interpretation is derived one then recognizes that the physical objects are the rays in Hilbert 
space and not the vectors themselves. 




in the superposition sees the world obey the familiar statistical quantum laws. 
However, there exists no outside agency which can designate which branch 
of the superposition is to be regarded as the real world. All are equally 
real, and yet each is unaware of the others. These conclusions obviously 
admit of immediate extension to the world of cosmology. Its wave function 
is like a tree with an enormous number of branches. Each branch corre- 
sponds to a possible universe-as-we-actually-see-it. If I were facetious, I 
would now point out that human history has taken place along one of these 
branches, and hence human philosophy, and in particular the Everett- 
Wheeler interpretation of quantum mechanics, has, as it were, arisen from 
the formalism itself! This completes the proof of the Everett-Wheeler 
metatheorem. 

There remains to be discussed only the question of the practical 
application of the formalism. How does it happen that I am able, without 
running into inconsistencies, to include as much or as little as I like of the 
real world of cosmology in my Hamiltonian? Why should I be so fortunate 
as to be able, in practice, to avoid dealing with the wave function of the 
universe? The answer to these questions is to be found in the statistical 
implications of sequences of measurements of the kind which led to the state 
vector (38). Consider one of the memory sequences in this state vector (that 
is, one of the elements of the superposition). This memory sequence 
defines an average value for the system observable, given by 


$$\langle s \rangle_{s_1 \ldots s_N} = \sum_s s\, f(s; s_1 \ldots s_N) \tag{48}$$


If the sequence is random, as it is overwhelmingly likely to be, then this
average will differ only by an amount of order $\varepsilon$ from the average

$$\langle s \rangle = \sum_s s\, w_s \tag{49}$$

But the latter average may also be expressed in the form

$$\langle s \rangle = \langle \psi | \mathbf{s} | \psi \rangle \tag{50}$$


where $|\psi\rangle$ is the initial state vector of any one of the identical systems and
$\mathbf{s}$ is the operator of which the $s$’s are the eigenvalues. In this form the basis
vectors $|s\rangle$ do not appear. It is evident therefore that had I chosen to
introduce a different apparatus, designed to measure some observable $\mathbf{r}$ not
equal to $\mathbf{s}$, a sequence of repeated measurements would have yielded me in
this case an average approximately equal to

$$\langle r \rangle = \langle \psi | \mathbf{r} | \psi \rangle \tag{51}$$

In terms of the basis vectors $|s\rangle$ this average is given by

$$\langle r \rangle = \sum_{s, s'} c_s^* \langle s | \mathbf{r} | s' \rangle c_{s'} \tag{52}$$




Now suppose, however, that I first measure $\mathbf{s}$ and then perform a
statistical analysis on $\mathbf{r}$. This could be accomplished by introducing a
second apparatus which performs a sequence of observations on a set of
identical two-component systems all in identical states given by the vector
$|\Psi_1\rangle$ of Eq. (7). Each of the latter systems is composed of one of the original
systems together with an apparatus which has just measured the observable
$\mathbf{s}$. In view of the packet orthogonality relations (17), I shall find for the
average of $\mathbf{r}$ in this case

$$\langle r \rangle = \langle \Psi_1 | \mathbf{r} | \Psi_1 \rangle = \sum_s w_s \langle s | \mathbf{r} | s \rangle = \mathrm{tr}(\rho\, \mathbf{r}) \tag{53}$$

where $\rho$ is the density operator

$$\rho = \sum_s |s\rangle\, w_s\, \langle s| \tag{54}$$

The averages (52) and (53) are generally not equal. In (53) the 
measurement of s, which the first apparatus has performed, has destroyed 
the quantum interference effects which are still present in (52). This means 
that the elements of the superposition (7) may, insofar as the subsequent 
quantum behavior of the system is concerned, be treated as if they were 
members of a statistical ensemble. This is what allows us, in practice, to 
collapse the wave packet after a measurement has occurred. It is also what 
permits us to introduce, and study the quantum behavior of, systems having 
well-defined initial states, without at the same time introducing into the 
mathematical formalism the apparatuses which prepared the systems in 
those states. 
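The difference between the averages (52) and (53) can be made concrete for a two-state system. The sketch below is illustrative only: the amplitudes `c`, the matrix `r`, and the function names are assumptions, and the weights are taken to be $w_s = |c_s|^2$, which is assumed here to be the content of Eq. (13).

```python
def expectation_pure(c, r):
    """Average (52): sum over s, s' of conj(c_s) <s|r|s'> c_{s'}."""
    dim = len(c)
    total = sum(c[s].conjugate() * r[s][t] * c[t]
                for s in range(dim) for t in range(dim))
    return total.real


def expectation_mixture(c, r):
    """Average (53): tr(rho r) with rho = sum_s |s> w_s <s|, w_s = |c_s|^2 (assumed)."""
    return sum(abs(c[s]) ** 2 * r[s][s] for s in range(len(c)))


# Illustrative (assumed) amplitudes c_s and a real symmetric observable r
# whose off-diagonal elements carry the interference terms.
c = [0.6 + 0j, 0.8 + 0j]
r = [[1.0, 0.5],
     [0.5, -1.0]]

print("with interference   :", expectation_pure(c, r))
print("after s is measured :", expectation_mixture(c, r))
```

Only the diagonal elements of `r` survive in the second average; the two numbers agree exactly when `r` is diagonal in the $|s\rangle$ basis, that is, when it commutes with the measured observable.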

It is, of course, possible in principle (although virtually impossible in 
practice) to restore the interference effects by bringing the apparatus packets 
back together again. But then the correlations between system and apparatus 
are destroyed, and no measurement results. If one attempts to maintain the 
correlations by sneaking in a second apparatus to “have a look” before the 
packets are brought back together, then the state vector of the second 
apparatus must be introduced, and the separation of its packets will destroy 
the interference effects. You cannot get around the laws of quantum 
mechanics. 


REFERENCES 


1. H. Everett, III, Rev. Mod. Phys. 29, 454 (1957). 

2. J. A. Wheeler, Rev. Mod. Phys. 29, 463 (1957). 

3. N. Bohr and L. Rosenfeld, det Kgl. Danske Vidensk. Selskab, Mat.-fys. 
Med. 12, No. 8 (1933); Phys. Rev. 78, 794 (1950). 

4. P. A. M. Dirac, The Principles of Quantum Mechanics, Third Edition 
(Oxford University Press, 1947). 

5. B. S. DeWitt, Dynamical Theory of Groups and Fields (Gordon and
Breach, 1965), pp. 16-29.


XI


Progress and Goals in 
Renormalization Theory 


KLAUS HEPP 


Fock Space and Renormalization of the “Gali-Lee” Model

Fun and Frustration with Relativistic Hamiltonians

The Intriguing Consistency of Renormalized Perturbation Theory
References


For twenty years renormalization has been a magic cure for the “catastrophes”
of relativistic quantum mechanics. Since the pioneering work of
Dyson, Feynman, Schwinger, and Tomonaga [1] we have learned to understand
better the individual terms in the relevant perturbation expansions and
finite relations among them [2]. Our expectations in the “better future”
have been cast into frameworks such as general quantum field theory [3, 4] and
analytic S-matrix theory [5], and renormalized perturbation theory is very
useful to illustrate some of their structure. Yet progress is only slowly being
made in building up models for relativistic quantum systems whose existence
can be controlled in rigorous mathematical terms.

These notes are not addressed to physicists working in relativistic 
quantum mechanics: our exposition is not mathematical enough. The 
material was informally presented to fill the image gap between mathemati- 
cians and physicists, to provide some background to Prof. Lascoux’ lectures 
and because the author felt strongly that many consistent streaks of beauty 
in the present nontheory will eventually fit together. 


FOCK SPACE AND RENORMALIZATION OF THE
“GALI-LEE” MODEL


It is interesting to study quantum theories with an unbounded number 
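of particles. Fock space is the natural receptacle for such theories (until
one encounters Haag’s theorem [3]). In nonrelativistic quantum mechanics
we define

$$\mathcal{H}_0^s = \mathcal{H}_0^f = \mathbb{C}, \qquad \mathcal{H}_1^s = \mathcal{H}_1^f = \mathcal{H}_1 = L^2(\mathbb{R}^3, dx) \tag{1}$$

with $dx$: Lebesgue measure, $x \in \mathbb{R}^3$, and

$$\mathcal{H}_n^s = \mathcal{H}_1^{\otimes_s n}, \qquad \mathcal{H}_n^f = \mathcal{H}_1^{\otimes_a n}$$

where $\otimes_s n$, $\otimes_a n$ stand for the symmetric and antisymmetric tensor product of $n$
copies. The symmetric and antisymmetric algebras

$$S(\mathcal{H}_1) = \mathcal{H}^s = \bigoplus_{n=0}^{\infty} \mathcal{H}_n^s, \qquad A(\mathcal{H}_1) = \mathcal{H}^f = \bigoplus_{n=0}^{\infty} \mathcal{H}_n^f \tag{2}$$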


are called Fock spaces of one species of particles obeying Bose (symmetric)
or Fermi (antisymmetric) statistics. Creation and annihilation operators are
defined on any $\Phi^s \in \mathcal{H}^s$, $\Phi^s = \{\phi_n^s\}$, by

$$(a^s(p)\Phi^s)_n(p_1, \ldots, p_n) = (n+1)^{1/2}\, \phi_{n+1}^s(p, p_1, \ldots, p_n) \tag{3}$$

$$(a^s(p)^*\Phi^s)_n(p_1, \ldots, p_n) = n^{-1/2} \sum_{j=1}^{n} \delta(p - p_j)\, \phi_{n-1}^s(p_1, \ldots, \hat{p}_j, \ldots, p_n) \tag{4}$$


and similarly for Fermi operators $a^f(p)$, $a^f(p)^*$ with a factor $(-1)^j$ on the
right-hand side of (4). Evidently $f \in L^2(\mathbb{R}^3) \mapsto a(f) = \int dp\, f(p)\, a(p)$ and
$a(f)^* = \int dp\, f(p)\, a(p)^*$ define operator-valued tempered distributions. The
$a^f(f)$, $a^f(f)^*$ are bounded operators in $\mathcal{H}^f$, as is most easily seen from the
commutation relations:

$$[a(p), a(q)^*]_\pm = \delta(p - q), \qquad [a(p), a(q)]_\pm = [a(p)^*, a(q)^*]_\pm = 0 \tag{5}$$

where $[A, B]_\pm = AB \pm BA$ and “+” applies to fermions, “−” to bosons.
Let $\Omega_0 = \{\phi_0 = 1, \phi_n = 0, n \ge 1\}$. Then $a(p)\Omega_0 = 0$ for all $p$. $\Omega_0$ is cyclic
with respect to the smeared-out polynomials in the creation operators. In
$\mathcal{H}$ a projective unitary representation of the Galilei group $G$

$$g \in G:\quad x \mapsto Rx + vt + a, \quad t \mapsto t + b, \qquad R \in SO(3),\ v, a \in \mathbb{R}^3 \tag{6}$$

is trivially defined [6], the 1-parameter group of the time-translations being
generated by the free Hamiltonian $H_0$:


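$$(e^{iH_0 t}\Phi)_n(p_1, \ldots, p_n) = \exp\Bigl(i \sum_{j=1}^{n} \frac{p_j^2}{2m}\, t\Bigr) \phi_n(p_1, \ldots, p_n), \qquad H_0 = \int dp\, \frac{p^2}{2m}\, a(p)^* a(p) \tag{7}$$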
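The commutation relations (5) and the boundedness of the Fermi operators can be illustrated in a drastically simplified setting: a single discrete mode instead of the continuum of momenta $p$. The sketch below is only an illustration under that assumption; the truncation `CUTOFF` and all function names are hypothetical. The factor `sqrt(n)` is the single-mode counterpart of the $(n+1)^{1/2}$ in (3).

```python
from math import sqrt


def annihilation_matrix(cutoff):
    """Matrix of a on span{|0>, ..., |cutoff>}, with a|n> = sqrt(n)|n-1>."""
    dim = cutoff + 1
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = sqrt(n)
    return a


def adjoint(m):
    """Transpose of a real matrix (equal to the adjoint, all entries being real)."""
    return [list(col) for col in zip(*m)]


def matmul(x, y):
    """Plain matrix product of two square matrices given as nested lists."""
    dim = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(dim)) for j in range(dim)]
            for i in range(dim)]


def bracket(x, y, sign):
    """[x, y]_sign = xy + sign * yx; sign = -1 for bosons, +1 for fermions."""
    xy, yx = matmul(x, y), matmul(y, x)
    dim = len(x)
    return [[xy[i][j] + sign * yx[i][j] for j in range(dim)] for i in range(dim)]


# Bose mode, truncated at an (assumed) maximal occupation number.
CUTOFF = 5
a = annihilation_matrix(CUTOFF)
boson = bracket(a, adjoint(a), -1)

# Fermi mode: occupation 0 or 1 only, so the 2 x 2 matrix is exact.
b = annihilation_matrix(1)
fermion = bracket(b, adjoint(b), +1)

print([boson[n][n] for n in range(CUTOFF + 1)])
print(fermion)
```

The truncated Bose commutator equals the identity on every level below the cutoff and fails only on the highest retained level, a reminder that the Bose operators are unbounded and cannot be represented exactly by finite matrices. The Fermi anticommutator is the identity without any truncation error.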




In this nonrelativistic framework quite interesting dynamical situations can
be rigged up. The usual Schrödinger theory for identical particles with pair
interaction leaves each $\mathcal{H}_n$ invariant: let $v(p) = v(-p)^* \in L^2(\mathbb{R}^3)$ and define


yh fen +o 2) ae eoml leo) 


(8) 


Then $V|\mathcal{H}_n$ gives the p-space representation of the sum of 2-body potentials.
$H_0 + V$ is self-adjoint in $\mathcal{H}$ and can, for $v(p) = v(|p|)$, be imbedded as infinitesimal
generator of time-translations into a projective unitary representation
of the Galilei group.

A very interesting illustration of renormalization is the Lee model [7], 
which in its Galilei invariant version [6, 8] incorporates several nontrivial 
features: 


a. Pure Hilbert space theory (no “ghosts”)
b. Particle production and annihilation 

c. Point interaction 

d. Ten-parameter symmetry group 


The Fock space is $\mathcal{H} = \mathcal{H}^f \otimes \mathcal{H}^s$, where $\mathcal{H}^s$ is the symmetric algebra
over the 1-particle space of the $\theta$-particle and $\mathcal{H}^f$ the antisymmetric algebra
over the direct sum of 1-particle spaces of the $N$- and $V$-particles.

We have the following nontrivial (anti-) commutation relations:

$$[N(p), N(q)^*]_+ = [V(p), V(q)^*]_+ = [\theta(p), \theta(q)^*]_- = \delta(p - q) \tag{9}$$

The free Hamiltonian $H_0$ is

$$H_0 = H_0(U_0) = \int dp\, \frac{p^2}{2m_1}\, N(p)^* N(p) + \int dp\, \frac{p^2}{2m_2}\, \theta(p)^* \theta(p) + \int dp \Bigl(\frac{p^2}{2m_3} + U_0\Bigr) V(p)^* V(p) \tag{10}$$


The elementary interaction is the transition $N + \theta \leftrightarrow V$. Let $\chi \in \mathcal{S}(\mathbb{R}^3)$ [9] be
a spherically symmetric cutoff function with $0 \le \chi \le 1$ and define


H,* = do fap, dp. xX(pi + palV(pi + P2)*N(p1)0(p2) 


+ N(p1)"0(p2)"V(pi + pad) (11) 


The full Hamiltonian is $H^\chi = H_0 + H_1^\chi$. Then there are two parameters
in the theory: the “bare” internal energy $U_0$ of the bare $V$-particle
and the bare coupling constant $\lambda_0$ between the bare $N$-, $\theta$-, and $V$-particles.
The diminutive “bare” expresses the fact that in the free-particle basis of
Fock space the full Hamiltonian is no longer diagonal and that physical
particles correspond to (generalized) eigenstates of $H^\chi$. The cutoff $\chi$ is
necessary to make $H_1^\chi$ densely defined: otherwise $H_1^\chi$ would be inapplicable
to any state with a finite number of bare particles with at least one of the
$V$-species.

There are important symmetries in the theory. Apparently the number
operators $N_1 = N_N + N_V$, $N_2 = N_\theta + N_V$ commute with $H_0$ and $H_1^\chi$.
Therefore the dynamics factorizes in subspaces $\mathcal{H}_{n_1 n_2}$ of eigenvalues $n_1$, $n_2$ of
these operators:

$$\mathcal{H} = \bigoplus_{n_1, n_2 = 0}^{\infty} \mathcal{H}_{n_1 n_2} \tag{12}$$

$H_1^\chi$ restricted to any $\mathcal{H}_{n_1 n_2}$ is bounded and therefore $H^\chi$ is self-adjoint. If
one requires [10]

$$m_3 = m_1 + m_2 \tag{13}$$

for the masses $m_1$, $m_2$, $m_3$ of the bare $N$-, $\theta$-, and $V$-particles, then $H^\chi$ can
be again imbedded into a projective unitary representation of the Galilei
group.

Evidently, interaction occurs only in those subspaces $\mathcal{H}_{n_1 n_2}$ where
$n_1 \cdot n_2 \ne 0$. Therefore the vacuum is an eigenstate of $H^\chi$ and bare and physical
$N$- and $\theta$-particles coincide. The first nontrivial sector has $n_1 = n_2 = 1$. It
will turn out that the necessary reinterpretation of the theory in the limit of
point interaction ($\chi \to 1$) can be entirely defined from the (1, 1)-sector, or in
physical terms: the internal energy and charge renormalization in this sector
carries consistently through into all $\mathcal{H}_{n_1 n_2}$.

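Let $|N(p_1)\theta(p_2)\rangle = |N\theta(p, q)\rangle = N(p_1)^* \theta(p_2)^* \Omega_0$ and $|V(p)\rangle = V(p)^* \Omega_0$
be the improper basis of $\mathcal{H}_{1,1}$. We have separated the center-of-mass movement:
$p = p_1 + p_2$, $q = m_3^{-1}(m_2 p_1 - m_1 p_2)$. The eigenvalue problem for
$H^\chi$ in $\mathcal{H}_{1,1}$ can be solved explicitly [7]: there exists a discrete eigenstate
$|V(p)\rangle^\chi$, the physical $V$-particle,

$$H^\chi |V(p)\rangle^\chi = \Bigl(\frac{p^2}{2m_3} + U_1\Bigr) |V(p)\rangle^\chi \tag{14}$$

and a continuum of $N$-$\theta$ scattering states

$$H^\chi |N\theta(p, q)\rangle_\pm^\chi = \Bigl(\frac{p^2}{2m_3} + \omega\Bigr) |N\theta(p, q)\rangle_\pm^\chi \tag{15}$$

of energy $\omega$ in the center of mass system ($\omega = q^2/2m$, $m_3 m = m_1 m_2$).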
The “±” stands for outgoing or incoming boundary conditions: for
$f_1, f_2 \in \mathcal{S}(\mathbb{R}^3)$,

$$|N(f_1)\theta(f_2)\rangle_\pm^\chi = \int dp_1\, dp_2\, f_1(p_1) f_2(p_2)\, |N(p_1)\theta(p_2)\rangle_\pm^\chi \tag{16}$$

is the wave packet of the interacting $N$-$\theta$ system at time $t = 0$, which for
$t \to \pm\infty$ can be described by a freely moving $N$- and $\theta$-particle with wave
packets $f_1$, $f_2$, or in the language of Hilbert space

$$\lim_{t \to \pm\infty} \bigl\| e^{iH^\chi t}\, |N(f_1)\theta(f_2)\rangle_\pm^\chi - e^{iH_0 t}\, |N(f_1)\theta(f_2)\rangle \bigr\| = 0 \tag{17}$$


The solutions of (14) and (15) are given by 


1 — 71/2 _ x(w) dq 
IV(D)>* = Zz" 1V(p)> — Ay [BG INP. >. (18) 
| AM) 
INO(p, 9)>% = |NO(p, g)> + 7 0) 


«(zy IV +2, FAP iworp. g’>} 9) 


where 
Z7eER yao A "Ao 
and 
dqx(o) 
a = 2 
Date. [5 oy (20) 
dqx(w)? 
enn See a Coa (21) 
= 2 x(w)” dq 
Mod = 2.422 [GT N@ =U " 


The requirement that this solution stay in Hilbert space is 0<Z, <I, 
which is possible in the finite range 


1 (/-U\'/4 
A|<A.=-(=G 
| a c -(=) 
even in the limit $\chi \to 1$.


The observable parameters of the theory are $\lambda_1$ and $U_1$: $\lambda_1^2$ is determined
by the limit $\omega \to 0$ of the $N$-$\theta$ scattering amplitude (as the fine-structure
constant $\alpha = e^2/4\pi$ in quantum electrodynamics) and $U_1 < 0$, the
internal energy of the physical $V$-particle, is measurable by thresholds in the
$N\theta\theta$-scattering. Therefore a reasonable limit of the theory for $\chi \to 1$ is
expected, if one keeps the physical parameters fixed: $\lambda = \lambda_1$, $|\lambda| < \lambda_c$, and
$U = U_1 < 0$ and varies $\lambda_0 = \lambda_0(\lambda, U, \chi)$ and $U_0 = U_0(\lambda, U, \chi)$ accordingly.
This is the basic idea of renormalization theory, which has been so extremely
successful in quantum electrodynamics.




What happens for $\chi \to 1$? Nothing really in terms of the generalized
eigenfunctions $|V(p)\rangle^\chi$ and $|N\theta(p, q)\rangle_\pm^\chi$! They stay a complete system
of improper vectors in $\mathcal{H}_{1,1}$, and if one defines

$$\exp(iH't)\, |V(p)\rangle' = \exp i\Bigl(\frac{p^2}{2m_3} + U\Bigr)t\ |V(p)\rangle', \qquad \exp(iH't)\, |N\theta(p, q)\rangle_\pm' = \exp i\Bigl(\frac{p^2}{2m_3} + \omega\Bigr)t\ |N\theta(p, q)\rangle_\pm' \tag{23}$$

then one obtains by Stone’s theorem a self-adjoint renormalized Hamiltonian
$H'$. Formally we see from

$$H^\chi = H_0(U) + H_1^\chi + (U_0 - U) \int dp\, V(p)^* V(p) \tag{24}$$

that $H'$ is obtained by the subtraction of an infinite self-energy counterterm.
In the limit $\chi \to 1$ the Haag-Ruelle asymptotic condition (17) remains valid,
while $\exp(iH_0 t)|V\rangle$ only converges weakly toward $\exp(iH't)|V\rangle' \cdot Z^{1/2}$.

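The next interesting sector is the $N\theta\theta$-sector $\mathcal{H}_{1,2}$. Here the proposal
is first to determine the scattering states

$$H^\chi |V(p_1)\theta(p_2)\rangle_\pm^\chi = \Bigl(\frac{p_1^2}{2m_3} + U + \frac{p_2^2}{2m_2}\Bigr) |V(p_1)\theta(p_2)\rangle_\pm^\chi \tag{25}$$

$$H^\chi |N(p_1)\theta(p_2)\theta(p_3)\rangle_\pm^\chi = \Bigl(\frac{p_1^2}{2m_1} + \frac{p_2^2}{2m_2} + \frac{p_3^2}{2m_2}\Bigr) |N(p_1)\theta(p_2)\theta(p_3)\rangle_\pm^\chi \tag{26}$$

and then to prove completeness of these generalized eigenfunctions in
$\mathcal{H}_{1,2}$ and continuity in $\chi$. Finally one would define $H'$ by its spectral
projections.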

Schrader [8] has “solved” the singular integral equation for this
3-body problem with recoil by estimates similar to those of Faddeev [11].
His results are positive for sufficiently small values of $|\lambda|$, proving the consistency
of renormalization (24) in the first nontrivial sector. Although the
coupled integral equations for the higher sectors become discouragingly
complicated, our experience with the Schrödinger N-body problem [12]
makes us believe that the renormalized Gali-Lee model is well-defined and
asymptotically complete for fixed $U < 0$ and sufficiently small coupling
constant $|\lambda| < \lambda_c = \lambda_c(n_1, n_2)$.


FUN AND FRUSTRATION WITH RELATIVISTIC 
HAMILTONIANS 


It is tempting to pursue the same approach as in the Lee model and to 
explore relativistic Hamiltonians in Fock space. Although we shall encounter
immediately formidable difficulties, it appears that in modern mathematical
language, supported by powerful estimates, the classical difficulties of
Hamiltonian quantum field theory can slowly be dissolved.
An interesting class of potentials in Fock space are polynomials in
smeared-out creation and annihilation operators: $V = \sum_{k,l} V_{k,l}$,

$$V_{k,l} = \int \prod_{i=1}^{k+l} dp_i\ v_{k,l}(p_1, \ldots, p_{k+l})\, a(p_1)^* \cdots a(p_k)^*\, a(p_{k+1}) \cdots a(p_{k+l}) \tag{27}$$

$V$ is densely defined, if, for example, $v_{k,l} \in L^2(\mathbb{R}^{3(k+l)})$, and symmetric, if

$$v_{k,l}(p_1, \ldots, p_{k+l})^* = v_{l,k}(p_{k+l}, \ldots, p_1)$$

The first difficulty arises, if one wants $V$ to commute with all space-translations
$U_0(a, 1)$ in Fock space. $U_0(a, 1)\, V\, U_0(a, 1)^{-1} = V$ implies that
$v_{k,l}$ depends on $p_1 + \cdots - p_{k+l}$ only via a $\delta$-function. Then, however, every
$V_{k,0} \ne 0$, $k \ne 0$, leads out of Fock space, since the $\delta$-function in the wave
functions of the images is not locally square integrable. Fock space
translation invariance can hold only for potentials without “vacuum polarization,”
for which $V\Omega_0 = c\,\Omega_0$ holds, as in the Lee model. This observation
can be refined to Haag’s theorem [3].

A second difficulty occurs if $v_{k,l}$ is not sufficiently decreasing at infinity
and $k > 1$. Such an “ultraviolet” (UV) divergence was the main obstacle
in the Lee model in the limit of point interaction. Since all sets in $\mathbb{R}^4$ except
for $\{0\}$ invariant under the homogeneous Lorentz group $L_+^\uparrow$ are unbounded,
one is (in contradistinction to Galilei-invariant theories) quite naturally led
to point interactions in relativistic theories. Sometimes it will be possible
to cure the UV divergences by renormalization.

Relativistic quantum mechanics in Fock space can be built on a rich
class of operator-valued distributions with a very attractive property:
locality. In the simplest case of a neutral scalar particle of mass $m$, the
1-particle space $\mathcal{H}_1$ is $L^2(\mathbb{R}^3, m(p)^{-1} dp)$, where $m(p) = (m^2 + \mathbf{p}^2)^{1/2}$, $m > 0$.
We distinguish vectors in $\mathbb{R}^4$ ($p, k, x$) and their space ($\mathbf{p}, \mathbf{k}, \mathbf{x}$) and time
($p^0, k^0, x^0$) components. The Poincaré group $iL_+^\uparrow$ operates on space-time
by translation and (Lorentz-) rotation: $(a, \Lambda)$: $x \in \mathbb{R}^4 \mapsto \Lambda x + a$, where
$a \in \mathbb{R}^4$ and $\Lambda \in SL(4, \mathbb{R})$ leaves $(x, y) = x^0 y^0 - \mathbf{x} \cdot \mathbf{y}$ invariant as well as the
direction of time. $\mathcal{H}_1$ carries a unitary irreducible representation $U_1(a, \Lambda)$
of $iL_+^\uparrow$:

$$(U_1(a, \Lambda) f)(p) = \exp(i(p, a))\, f(\Lambda^{-1} p)\big|_{p^0 = m(p)} \tag{28}$$

Our Fock space is $\mathcal{H} = S(\mathcal{H}_1)$, with $\Omega_0 = \{1, 0, \ldots\}$ as vacuum state and
$U_1(a, \Lambda)$ canonically extended to $\mathcal{H}$. Creation and annihilation operators
are defined as in (3) and (4) with $(n+1)^{1/2}$, $n^{-1/2}$ replaced by


re De a 
(i) Fan, 


respectively. 




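The free field

$$A_0(x) = \bigl(2(2\pi)^3\bigr)^{-1/2} \int \frac{dp}{m(p)} \bigl\{ a(p)\, e^{-i(p, x)} + a(p)^*\, e^{i(p, x)} \bigr\} \tag{29}$$

satisfies as operator-valued tempered distribution on a dense, invariant
domain

$$(\Box + m^2) A_0(x) = 0, \qquad U_0(a, \Lambda)\, A_0(x)\, U_0(a, \Lambda)^{-1} = A_0(\Lambda x + a) \tag{30}$$

where $\Box = \dfrac{\partial^2}{\partial x_0^2} - \Delta$.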


In addition, $A_0$ is local, due to the choice of symmetric statistics [13], that is
$[A_0(x), A_0(y)]_- = 0$ for spacelike $x - y$, $(x - y, x - y) < 0$. $A_0$ decomposes
naturally into a creation ($A_+$) and annihilation part ($A_-$). Then Wick
monomials of $A_0$ are defined by

$$:A_0(x_1) \cdots A_0(x_n): \; = \sum_{X \subset \{1, \ldots, n\}} \prod_{i \in X} A_+(x_i) \prod_{j \notin X} A_-(x_j) \tag{31}$$

$:A_0(x)^n:$ is again a tempered operator-valued distribution and satisfies [14]:

$$U_0(a, \Lambda) :A_0(x)^n: U_0(a, \Lambda)^{-1} = \; :A_0(\Lambda x + a)^n:, \qquad [:A_0(x)^n:, :A_0(y)^m:]_- = 0 \ \text{ for all } m, n \text{ and all } (x - y, x - y) < 0 \tag{32}$$


We shall see soon why Wick polynomials of free fields and their generaliza- 
tions [15, 16] are attractive building stones for relativistic Hamiltonians. 

Our goal is to construct a model for an interacting relativistic quantum 
system. For a theory of one kind of massive neutral scalar particle the 
following general requirements are often introduced as “axioms” [17, 3, 18]: 

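(A): The Hilbert space of states $\mathcal{H}$ has to carry two basis systems of the
structure of Fock spaces over $L^2(\mathbb{R}^3, m(p)^{-1} dp)$:

$$\mathcal{H} = \bigoplus_{n=0}^{\infty} \mathcal{H}_n^{\mathrm{in}} = \bigoplus_{n=0}^{\infty} \mathcal{H}_n^{\mathrm{out}} \tag{33}$$

$\mathcal{H}_n^{\mathrm{in}}$ and $\mathcal{H}_n^{\mathrm{out}}$ are the subspaces spanned by incoming and outgoing $n$-particle
scattering states. The unitary representations of $iL_+^\uparrow$ connected with $\mathcal{H}^{\mathrm{in}}$
and $\mathcal{H}^{\mathrm{out}}$ are to coincide: $U^{\mathrm{in}}(a, \Lambda) = U^{\mathrm{out}}(a, \Lambda) = U(a, \Lambda)$. There has to be
particle production: $\mathcal{H}_2^{\mathrm{in}} \ne \mathcal{H}_2^{\mathrm{out}}$.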

(B): A local relativistic quantum field $A(x)$ should “interpolate”
between $A^{\mathrm{in}}$ and $A^{\mathrm{out}}$ by the Haag-Ruelle asymptotic condition [4]:

$$[A(x), A(y)]_- = 0 \ \text{ for } (x - y, x - y) < 0, \qquad U(a, \Lambda)\, A(x)\, U(a, \Lambda)^{-1} = A(\Lambda x + a), \qquad P_1 A(x) \Omega \ne 0 \tag{34}$$

where $\Omega = \Omega^{\mathrm{in}} = \Omega^{\mathrm{out}}$ and $P_1$ is the projector on $\mathcal{H}_1^{\mathrm{in}} = \mathcal{H}_1^{\mathrm{out}}$. The relations
(34) are to be valid as operator-valued generalized functions [18] on a dense
invariant domain containing $\Omega$.

The concentration on local fields (or local rings [19, 20] with a more 
general attitude) is as reactionary as our outline of the construction of 
relativistic models and perturbation expansions. It is the personal belief 
of the lecturer that the modest, but always reasonable, predictions based on 
(A) and (B) can be justified and extended to a qualitative understanding of 
relativistic dynamics, once we master the constructive approach to quantum 
field theory. 

The simplest reasonable potential $V$ for a relativistic Hamiltonian is
the $A^4$ interaction in 2-dimensional space-time. Here only a space cutoff is
necessary:

$$V(R) = \lambda \int d\mathbf{x}\, h_R(\mathbf{x}) :A_0(\mathbf{x})^4: \tag{35}$$

Here $\lambda \in \mathbb{R}$ is the coupling constant and for $h_R \in \mathcal{S}(\mathbb{R}^1)$ we assume $h_R \ge 0$
and $h_R(\mathbf{x}) = 1$ for $|\mathbf{x}| < R$. $V(R)$ is a real, symmetric operator having a
common dense domain with $H_0$, as one sees in p-space:


4 d j 4 : 3 : 
VR) =2 fT TE had po{ TI a(po* + 4T] a(p)*at— pad 


2 4 4 4 
+ 6] a(p)* [] a(—po + 4atp.) [L a(—pd + [] pd} G6) 


Thus $H(R) = H_0 + V(R)$ has at least one self-adjoint extension. A proof of
the essential self-adjointness of H(R) on some standard domain would be 
desirable to support the following heuristic discussion [21]. 

One defines Heisenberg field operators

$$A_R(x)^n = \exp(iH(R)x^0) :A_0(\mathbf{x})^n: \exp(-iH(R)x^0) \tag{37}$$

If $R$ is sufficiently large, one obtains a local solution of the superficially
attractive field equation

$$(\Box + m^2) A_R(x) = 4\lambda A_R(x)^3 \tag{38}$$

as one sees from the free-field commutation relations:

$$\frac{\partial^2}{\partial x_0^2} A_R(x) = -\exp(iH(R)x^0)\, [H(R), [H(R), A_0(\mathbf{x})]_-]_-\, \exp(-iH(R)x^0)$$

$$[A_0(\mathbf{x}), :A_0(\mathbf{y})^n:]_- = 0 \tag{39}$$

$$[[H_0, A_0(\mathbf{x})]_-, :A_0(\mathbf{y})^n:]_- = n :A_0(\mathbf{y})^{n-1}: \delta(\mathbf{x} - \mathbf{y}).$$


More important is the observation [22] that at least in perturbation theory
$A_R(x)^n$ is independent of $R$, for sufficiently large $R$, and local with respect to
$A_R(y)^m$. The perturbation series in question can be obtained by iterating

$$U_R(t) = e^{iH_0 t}\, e^{-iH(R)t} = 1 - i \int_0^t ds\, V(R, s)\, U_R(s) \tag{40}$$

where $V(R, t) = \exp(iH_0 t)\, V(R) \exp(-iH_0 t)$. One obtains


$$e^{-iH(R)t} = e^{-iH_0 t} \sum_{n=0}^{\infty} (-i)^n \int_0^t dt_1 \int_0^{t_1} dt_2 \cdots \int_0^{t_{n-1}} dt_n\, V(R, t_1) \cdots V(R, t_n)$$

$$A_R(x) = U_R(x_0)^{-1} A_0(x)\, U_R(x_0) = A_0(x) + i \int_0^{x_0} dt\, [V(R, t), A_0(x)]_- + \cdots \tag{41}$$

The multiple commutators on the right-hand side, as

$$[V(R, t), A_0(x)]_- = \lambda \int dy\, h_R(y)\, [:A_0(t, y)^4:, A_0(x)]_- \tag{42}$$

depend only on $h_R(y)$ for $|y| < |x_0| + \delta$, $\delta > 0$, since all operators are local
with respect to each other. These results and the manifest Lorentz covariance
of the Gell-Mann-Low (GML) series [see (57)] motivate our interest
in local combinations of creation and annihilation operators for relativistic
Hamiltonians.

In this approach, the construction of a local quantum field theory
has to proceed in several steps: First one constructs in Fock space using the
$H(R)$ C*-algebras [19] of local observables $\mathfrak{A}(O)$ for all bounded open
regions $O \subset \mathbb{R}^4$. The norm closure $\mathfrak{A}$ of $\bigcup_O \mathfrak{A}(O)$ would have the Euclidean
group (space translations and rotations) and time translations as a continuous
group of automorphisms, with the latter only locally unitarily implementable
[22]. Having found a state $\omega$ on $\mathfrak{A}$ invariant under this automorphism group,
one constructs a new representation $\pi$ of $\mathfrak{A}$ in a Hilbert space with cyclic
invariant vector state $\Omega$ by the Gelfand-Segal construction

$$(\Omega, \pi(A)\Omega) = \omega(A), \qquad A \in \mathfrak{A} \tag{43}$$


In this representation the space-time translations $(a, 1)$ are unitarily implemented:
$(a, 1) \mapsto \exp i(P, a)$. If the spectrum of $P$ lies in the forward
light-cone and has isolated mass values, then Araki [20] has shown how to 
construct scattering states for any number of incoming or outgoing particles. 
To end our fiction, the S-matrix is expected to be nontrivial and the theory 
hopefully Lorentz invariant. 

Before this program can be carried through, UV divergences have to be 
removed without destroying locality, unless the potential is a pure Wick 
polynomial of scalar fields in 2-dimensional space-time. The simplest UV 
divergences occur for the Yukawa coupling of a 2-component spinor field 
$\psi = (\psi^1, \psi^2)$ with a scalar field $A$ in two dimensions. The renormalization
of this model has been masterfully carried out by Glimm [23]. 




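Due to the particularities of 2-dimensional space-time ($L_+^\uparrow$ is Abelian),
the 1-particle spaces for nucleons and antinucleons ($\mathcal{N}$, $\bar{\mathcal{N}}$) and mesons ($\mathcal{M}$)
are

$$\mathcal{N} = \bar{\mathcal{N}} = L^2(\mathbb{R}^1, M(p)^{-1} dp), \qquad \mathcal{M} = L^2(\mathbb{R}^1, m(p)^{-1} dp) \tag{44}$$

where

$$M(p) = (M^2 + p^2)^{1/2}, \qquad m(p) = (m^2 + p^2)^{1/2}, \qquad M, m > 0$$

In the Fock space

$$\mathcal{H} = S(\mathcal{M}) \otimes A(\mathcal{N} \oplus \bar{\mathcal{N}})$$

the creation operators for mesons are $a(k)^*$ and for nucleons and antinucleons
$b(k)^*$, $\bar{b}(k)^*$. The free meson field $A_0$ was defined in (29) and the free
fermion field $\psi_0 = (\psi_0^1, \psi_0^2)$ is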


val 2M(p) 
M(p) — 
— 7 i IMD 


The choice of antisymmetric statistics for the fermions makes the $\psi_0^i(x)$
local:


W(x) = Mate)” {b(p)e"""") + B(p)*e" } ,0= mip) 


(45) 
1/2 ; ; 
) {b(p)e=#™ — B(pyte”™) on mip 


$$[\psi_0^i(x), \psi_0^j(y)]_+ = [\psi_0^i(x)^*, \psi_0^j(y)]_+ = 0 \tag{46}$$

for all $1 \le i, j \le 2$ and all $(x - y, x - y) < 0$. The free Hamiltonian is

$$H_0 = \int dp\, \{ m(p)\, a(p)^* a(p) + M(p)\, b(p)^* b(p) + M(p)\, \bar{b}(p)^* \bar{b}(p) \}. \tag{47}$$

One sees from (45) that the interaction density

$$V(x) = \lambda :\{ \psi_0^1(x)^* \psi_0^2(x) + \psi_0^2(x)^* \psi_0^1(x) \}: A_0(x)$$

integrated over $x$ with a space cutoff $h(x) \in \mathcal{S}(\mathbb{R}^1)$, does not define an operator
in $\mathcal{H}$.

V(x) describes eight virtual processes: #+NV+No0, V+NVo 
M,N +MAN,N+MaN. In | dxh(x)V(x) the processes 0+ NW + 
N +M and VN +N +W are effected by the formal operators: 


Q,2.=4 {0 ap; b(p,)*b(p2)*a(p3)q(p,, P2» + ps) (48) 
where 
Q(P1> Po» Ps) = ~s/2 ACY pi)M(p)M(p2)m(ps)) 1 
x (M(p1)M(p2) — Pip2 — M*)"” sgn (pi — p2) (49) 




FIGURE 1 


does not belong to $L^2(\mathbb{R}^3)$. In perturbation theory, for example, in the GML
series, no UV divergences would occur if $Q_1$ and $Q_2$ were better behaved.
In fact, the theory is “overrenormalizable”: in (57) only one Feynman graph
and all its insertions into higher-order graphs are UV divergent. One is led
to make first $H_1 = H_0 + Q_1 + Q_2$ a bona fide operator in Fock space and
then to deal with $V_2 = V - Q_1 - Q_2$. An effective UV cutoff $\kappa$ is the
multiplication of $a(k)^{(*)}$, $b(k)^{(*)}$ and $\bar{b}(k)^{(*)}$ in $V$ by the characteristic function
of $\{|k| < \kappa\}$. Then $H_{1\kappa} = H_0 + Q_{1\kappa} + Q_{2\kappa}$ is densely defined in $\mathcal{H}$.

In the Lee model we were able to diagonalize $H^\chi$ in the (1, 1)-sector
and to renormalize the theory by “calculating backward.” Instead of using
the Friedrichs series [24] for finding an operator $U$, such that $HU = UH_0$,
Glimm constructs from the first- and second-order terms in this series an
approximate diagonalization $T_\kappa$ of $H_{1\kappa}$:

$$(H_{1\kappa} + R_\kappa)\, T_\kappa = T_\kappa H_0 + F_\kappa \tag{50}$$

Here the $F_\kappa$ and $T_\kappa$ are defined on a dense domain $\mathcal{D}$ common with $H_0$ for
$\kappa < \infty$ and converge strongly on $\mathcal{D}$ for $\kappa \to \infty$. $R_\kappa$ is the counter term to
renormalize $H_{1\kappa}$, which on $T_\kappa \mathcal{D}$ has the simple form

$$R_\kappa = \delta m_\kappa^2 \int :A_0(x)^2: h(x)^2\, dx + c_\kappa 1 \tag{51}$$

$\delta m_\kappa^2$ is a meson mass renormalization, which diverges as $\log \kappa$ for $\kappa \to \infty$,
while $c_\kappa$ copes with vacuum fluctuations, $c_\kappa \to \infty$ for $\kappa \to \infty$, which would
cancel in the GML series. The main result of Glimm is the following:


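Theorem. For all $\phi \in \mathcal{D}$, $\psi_\kappa = T_\kappa \phi$, $\psi = \text{s-lim}\, \psi_\kappa = T\phi$,

$$\underset{\kappa \to \infty}{\text{s-lim}}\ (H_{1\kappa} + R_\kappa)\, \psi_\kappa = H_{1,\mathrm{ren}}\, \psi \tag{52}$$

exists and defines $H_{1,\mathrm{ren}}$ as a symmetric operator with dense domain $\mathcal{D}_1 = T\mathcal{D}$.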




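The proof involves a long chain of nontrivial estimates. Next, the remainder
is estimated as a bilinear form on $T_\kappa \mathcal{D} \times T_\kappa \mathcal{D}$, $\kappa < \infty$. The limit

$$\lim_{\kappa \to \infty}\, (\psi_\kappa, (H_{1\kappa} + R_\kappa + V_{2\kappa})\, \psi_\kappa') \tag{53}$$

is shown to define a symmetric bilinear form on $\mathcal{D}_1 \times \mathcal{D}_1$. After an additional
finite renormalization ($R_\kappa \to R_\kappa'$ with $\delta m_\kappa^{2\prime} - \delta m_\kappa^2$, $c_\kappa' - c_\kappa \in \mathbb{R}$
fixed), the final result is the following:

Theorem. There exists a positive self-adjoint operator $H_{\mathrm{ren}}$, whose bilinear
form extends $\lim\, (\psi_\kappa, (H_{1\kappa} + R_\kappa' + V_{2\kappa})\, \psi_\kappa')$.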


Thus, for both the $A^4$ and the Yukawa interaction, one has in two
dimensions reasonable candidates for a local relativistic dynamics in the Heisenberg
picture. However, very serious difficulties will be encountered in the
renormalization of these Hamiltonians in a physical 4-dimensional theory.


THE INTRIGUING CONSISTENCY OF RENORMALIZED 
PERTURBATION THEORY 


In this final section we shall outline how Feynman integrals appear in 
relativistic quantum mechanics and how gracefully renormalized perturbation 
theory survives the most violent manipulations. 

Let us start with the Yukawa interaction in 4-dimensional space-time.
Cutoffs are imposed as in Glimm’s investigations, a space cutoff $R$ and
a UV cutoff $\kappa$ at least for the fermion fields. Since the regularized Fermi
fields are bounded operators, the only unboundedness of $V_\kappa(R)$ arises from
the linear dependence on the smeared-out meson creators and annihilators.
Their $N_M^{1/2}$ growth ($N_M$: meson number operator) is infinitely small in the
sense of T. Kato [25] with respect to $H_0$. Y. Kato [26] and O. E. Lanford
[27] have proved the convergence of the perturbation series (41) for
$\exp it[H_0 + \lambda V_\kappa(R)]$ for all $\lambda \in C$, in the norm topology of $\mathscr{B}(D_\alpha, D_\beta)$,
$\alpha > \beta > 0$. The $D_\alpha$ are a chain of Banach spaces dense in $\mathscr{H}$ ($D_\alpha \supset D(\exp \beta N_M)$
for all $\beta < \alpha$). If the space cutoff is effected by considering the theory in a
finite space volume with periodic boundary conditions, then $H_0 + \lambda V_\kappa(R)$
has a unique ground state $\Omega \in D_\alpha$ for sufficiently small $|\lambda|$, which is again
in $D_\alpha$ given by a convergent perturbation expansion [25]

$$\Omega = \sum_n \lambda^n \Omega_n \tag{54}$$


Physically, the most interesting quantities of quantum field theory are
the vacuum expectation values (VEV) of the time-ordered products of field
operators. For a scalar field satisfying (A) and (B), the time-ordered product
$TA(x_1)\cdots A(x_n)$ is an operator-valued generalized function which for

$$x^0_{i_k} - x^0_{i_{k+1}} > 0, \quad 1 \le k \le n-1, \quad \{i_1, \dots, i_n\} = \{1, \dots, n\}$$

coincides with $A(x_{i_1})\cdots A(x_{i_n})$. The extension to $\mathscr{S}'(R^{4n})$ can for the
following theorem be fairly arbitrary. In perturbation theory this question is
closely related to renormalization [see (68)]. Truncated time-ordered VEV
are recursively defined by


$$\langle TA(x_1)\cdots A(x_n)\rangle^T = (\Omega, TA(x_1)\cdots A(x_n)\Omega) - \sum_P \langle TA(x_{i_1})\cdots A(x_{i_k})\rangle^T \cdots \langle TA(x_{j_1})\cdots A(x_{j_l})\rangle^T \tag{55}$$

where $\sum_P$ extends over all partitions of $\{1, \dots, n\}$ into more than one set, and
the recursion starts from $\langle TA(x)\rangle^T = (\Omega, A(x)\Omega)$. Let $t(p_1, \dots, p_n)$ be the
Fourier transform of (55). Then one has the following [17, 28].


Theorem. In a quantum field theory satisfying (A) and (B)

$$\langle p_1, \dots, p_m \mid p_{m+1}, \dots, p_n\rangle^T = c_{m,n} \prod_{i=1}^{n} \big((p_i, p_i) - m^2\big)\, t(p_1, \dots, -p_n)\Big|_{p_i^0 = \omega(\mathbf{p}_i)} \tag{56}$$

$c_{m,n}$ is some constant, and the left-hand side is the nontrivial part of the
scattering amplitude for the process

$$p_{m+1}, \dots, p_n \to p_1, \dots, p_m$$


after having subtracted recursively as in (55) all contributions from independ- 
ent scatterings of disjoint subsystems. 
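
The recursive subtraction in (55) can be made concrete with a small sketch. What follows is only an illustration under stated assumptions, not the construction used in the lectures: the full expectation values are supplied by a hypothetical black-box function `full` taking a tuple of point labels, time ordering and all distribution-theoretic questions are ignored, and the Gaussian toy model `gaussian_full` is introduced here purely as a test case in which every truncated function beyond the two-point one should vanish.

```python
from itertools import combinations


def set_partitions(items):
    """Yield every partition of the list `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    # Choose which of the remaining elements share a block with `first`.
    for size in range(len(rest) + 1):
        for companions in combinations(rest, size):
            block = (first,) + companions
            remaining = [x for x in rest if x not in companions]
            for tail in set_partitions(remaining):
                yield [block] + tail


def truncated(points, full, cache=None):
    """Truncated value for `points`, following the recursion of (55).

    `full` is an assumed callable returning the untruncated value
    for a tuple of distinct point labels.
    """
    if cache is None:
        cache = {}
    key = tuple(points)
    if key not in cache:
        value = full(key)
        for partition in set_partitions(list(key)):
            if len(partition) > 1:  # only partitions into more than one set
                term = 1
                for block in partition:
                    term *= truncated(block, full, cache)
                value -= term
        cache[key] = value
    return cache[key]


def gaussian_full(two_point):
    """Toy model (assumption): full values built from pairings of a symmetric matrix."""
    def full(key):
        if len(key) % 2:
            return 0
        if not key:
            return 1
        first, rest = key[0], key[1:]
        return sum(two_point[first][rest[i]] * full(rest[:i] + rest[i + 1:])
                   for i in range(len(rest)))
    return full
```

In this toy model the two-point truncated function reproduces the matrix entry and the four-point truncated function cancels exactly, which is the pattern the subtraction of independent subsystems is designed to produce.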

In the cutoff Yukawa theory, by combining (41) and (54), one obtains
a convergent perturbation expansion for $\langle TO_1(x_1)\cdots O_n(x_n)\rangle^T$, $O_i = \psi_\kappa^{\#}$
or $A$, unfortunately with a radius of convergence which depends very
unfavorably on both cutoffs. However, this series is identical [27] with
another, marvellous series, where for all finite partial sums:

($\alpha$) The volume cutoff can be trivially removed. This holds for every
local interaction density $\lambda V(x)$. Hence one automatically reaches in
perturbation theory the non-Fock representation of the canonical
commutation relations which belong to the interaction $\lambda V(x)$ [30].

($\beta$) The UV cutoff can be consistently removed within the ambiguities
in the definition of time-ordered distributions.

The expansion in question is the Gell-Mann-Low series [29]:

$$\langle TO_1(x_1)\cdots O_m(x_m)\rangle^T = \sum_{n=m}^{\infty} \frac{(-i)^{n-m}}{(n-m)!} \int \prod_{j=m+1}^{n} dx_j\, \langle TO_{01}(x_1)\cdots O_{0m}(x_m)V(x_{m+1})\cdots V(x_n)\rangle_0^T \tag{57}$$


On the left-hand side are VEVs of Heisenberg fields, truncated with respect
to the physical vacuum $\Omega$, and on the right-hand side the VEVs of the
corresponding free fields truncated with respect to the Fock vacuum $\Omega_0$. ($\alpha$) can
be proved by inspection for UV cutoff interactions $V_\varphi(x) = \int dy\, \varphi(x - y)V(y)$,
$\varphi \in \mathscr{S}(R^4)$, which are obviously translation-invariant. Concerning ($\beta$), one
remarks that the VEV $\langle O_{01}(x_1)\cdots V(x_n)\rangle_0$ is a well-defined tempered
distribution [14]. Furthermore, Wick’s theorem [31] is certainly valid in the
following typical case:


Theorem. Let $V(x) = {:}A_0(x)^4{:}$. Then for test functions in $\mathscr{S}(R^{4n})$ with
support in some $\{|x_i^0 - x_j^0| > \delta > 0,\ 1 \le i < j \le n\}$

$$\langle TA_0(x_1)\cdots A_0(x_m)V(x_{m+1})\cdots V(x_n)\rangle_0^T = \sum_P \prod i\Delta_F(x_i - x_j), \qquad m \equiv 0\ (2) \tag{58}$$

Here $\sum_P$ extends over all partitions of

$$\{v_1, \dots, v_m, v_{m+1}^1, \dots, v_{m+1}^4, \dots, v_n^1, \dots, v_n^4\}$$

into sets of two elements, such that (1) $v_i^r$, $v_i^s$, $r \ne s$, never lie in the same set;
(2) after the identification $v_i^r = v_i$, $1 \le r \le 4$, the $n$ “vertices” $v_1, \dots, v_n$ are
connected by chains of sets $\{v_i, v_j\}$.
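
The combinatorial content of conditions (1) and (2) can be explored by brute force. The sketch below is an illustration only, and its conventions are assumptions: each point is described just by its number of legs (1 for a free-field point, 4 for an interaction vertex), the names `perfect_matchings`, `is_connected` and `wick_pairings` are hypothetical, and connectedness is required of all points together.

```python
def perfect_matchings(legs):
    """Yield every way of splitting the list `legs` into unordered pairs."""
    if not legs:
        yield []
        return
    first, rest = legs[0], legs[1:]
    for i, partner in enumerate(rest):
        for tail in perfect_matchings(rest[:i] + rest[i + 1:]):
            yield [(first, partner)] + tail


def is_connected(n_points, pairs):
    """True if the points 0..n_points-1 are linked by chains of pairs."""
    parent = list(range(n_points))

    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    for (a, _), (b, _) in pairs:
        parent[find(a)] = find(b)
    return len({find(v) for v in range(n_points)}) == 1


def wick_pairings(legs_per_point):
    """Yield the pairings allowed by conditions (1) and (2).

    `legs_per_point[v]` is the number of legs of point v (assumed
    convention: 1 for an external point, 4 for a quartic vertex).
    """
    legs = [(v, r) for v, count in enumerate(legs_per_point) for r in range(count)]
    for pairs in perfect_matchings(legs):
        if any(a[0] == b[0] for a, b in pairs):
            continue  # condition (1): no pair inside a single vertex
        if is_connected(len(legs_per_point), pairs):
            yield pairs  # condition (2): everything hangs together
```

For two quartic vertices alone this enumeration finds 4! = 24 pairings, all of which correspond to the same Feynman diagram with four parallel lines; the count is the usual symmetry bookkeeping rather than a number of distinct graphs.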

For every $\{v_i, v_j\}$ a “causal” distribution [32] or “Feynman propagator”
contributes:

$$i\Delta_F(x - y) = \langle TA_0(x)A_0(y)\rangle_0 = \lim_{\varepsilon\downarrow 0} \frac{i}{(2\pi)^4} \int d^4p\, \frac{\exp[-i(p, x - y)]}{(p, p) - m^2 + i\varepsilon} \tag{59}$$

To every $\prod i\Delta_F(x_i - x_j)$ corresponds a Feynman diagram [33]. The $v_1, \dots, v_n$
are represented by $n$ distinct points in a plane and every $i\Delta_F(x_i - x_j)$ by a
line connecting $v_i$ and $v_j$.


Example 


If one takes (58) seriously as a tempered distribution, by Fourier transform,
one would obtain for $\prod i\Delta_F(x_i - x_j)$ a Feynman integral of the type studied
in Prof. Lascoux’ lectures:

$$c \lim_{\varepsilon\downarrow 0} \int \prod_{i=1}^{K} dk_i \prod_{j=1}^{L} \big((q_j, q_j) - m^2 + i\varepsilon\big)^{-1} \tag{60}$$

Here $p_1, \dots, p_n$ are the variables conjugate to $x_1, \dots, x_n$, and the $q_j$ for every
line are linear combinations of the $p_1, \dots, p_n$ and of $K$ 4-vectors $k_1, \dots, k_K$,
if the graph has $K$ independent cycles. The factor $c$ contains $\delta(\sum p_i)$. For
more general interactions $V(x)$, the propagators contribute

$$P_j(q_j)\,[(q_j, q_j) - m_j^2 + i\varepsilon]^{-1} \tag{61}$$


with some polynomial $P_j$, and the vertices $v_i$ some constant “vertex part”
$\mathcal{X}_i$.


FIGURE 2


In general a Feynman integral is not even conditionally convergent at
infinity because of UV divergences. After a Laplace transform

$$P_j(q_j)\,[(q_j, q_j) - m_j^2 + i\varepsilon]^{-1} = \frac{1}{i}\, P_j(q_j) \int_0^\infty d\alpha_j\, \exp i\alpha_j[(q_j, q_j) - m_j^2 + i\varepsilon] \tag{62}$$

the multiple Gaussian $k$-integrals can be carried out and lead to an
$\alpha$-integrand

$$I_\varepsilon(\alpha, p) = \delta\Big(\sum p_i\Big)\, R(\alpha, p)\, \exp i\Big(\sum A_{ij}(\alpha)(p_i, p_j) - \sum_{j=1}^{L} \alpha_j (m_j^2 - i\varepsilon)\Big) \tag{63}$$

For $r, \varepsilon > 0$ (63) is absolutely integrable in $D_r = \{\alpha_j \ge r,\ 1 \le j \le L\}$. The
UV divergences appear as nonintegrable singularities on the boundary of $D_0$.
If one studies for $\varphi \in \mathscr{S}(R^{4n})$ and $\tilde\varphi$ its Fourier transform

$$\Big(\prod i\Delta_F, \varphi\Big) = \lim_{\varepsilon\downarrow 0} \lim_{r\downarrow 0} \int_{D_r} \prod d\alpha_j \int \prod dp_i\, \tilde\varphi(p_1, \dots, p_n)\, I_\varepsilon(\alpha, p) \tag{64}$$

then it can be shown that (64) is well-defined and continuous for all
$\varphi \in \mathscr{S}_N(R^{4n})$ [the subspace with the induced topology of all $\varphi \in \mathscr{S}(R^{4n})$,
which vanish with all derivatives of order $\le N$, whenever $x_i = x_j$, $i \ne j$].
Renormalization can be interpreted as an extension of this functional from
$\mathscr{S}_N(R^{4n})$ to $\mathscr{S}(R^{4n})$, where the arbitrariness of the Hahn-Banach extension
is strongly reduced by the requirements that (a) the axioms (A) and (B) of
quantum field theory should be valid up to every finite order in the perturbation
expansion in $\lambda$; (b) the modifications of the Wick expansion of the GML
series should be minimal.
A class of extensions of (64) satisfying (a) and (b) has been given by 
Bogoliubov and Parasiuk [34] for an arbitrary local interaction density V(x). 




Consider an arbitrary, fixed graph $G(\mathscr{V})$ with a set $\mathscr{V}$ of vertices $v_1, \dots, v_n$
and a set $\mathscr{L}$ of lines $l_1, \dots, l_L$ connecting them. We look for counter
terms to make the Wick expression $I_\varepsilon(\alpha, p)$ absolutely integrable over $D_0$
which do not contribute in (64) for $\varphi \in \mathscr{S}_N(R^{4n})$. The combinatorics of the
counter terms will be controlled by the subgraphs $G(\mathscr{V}')$ of $G(\mathscr{V})$, which are
defined by all lines from $\mathscr{L}$ connecting the vertices $v_1', \dots, v_{m'}' \in \mathscr{V}'$. For
every $\mathscr{V}' \subset \mathscr{V}$ the “superficial divergence” $\nu(\mathscr{V}')$ is defined by

$$\nu(\mathscr{V}') = {\sum_j}' (r_j + 2) - 4(m' - 1) \tag{65}$$

where $\mathscr{V}'$ has $m'$ elements, $r_j$ is the degree of the polynomial $P_j$ in (61), and
$\sum'$ extends only over those lines which connect vertices from $\mathscr{V}'$.
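
The power counting in (65) is easy to evaluate for small graphs. The following is a minimal sketch; the function name and the convention of passing the polynomial degrees of the connecting lines as a list are assumptions made here for illustration.

```python
def superficial_divergence(line_degrees, n_vertices):
    """Evaluate (65): sum of (r_j + 2) over the connecting lines, minus 4(m' - 1).

    `line_degrees` holds the degree r_j of the numerator polynomial for each
    line joining vertices of the subset; `n_vertices` is m'.
    """
    return sum(r + 2 for r in line_degrees) - 4 * (n_vertices - 1)
```

For scalar lines (all degrees zero) two vertices joined by two lines give 0, joined by three lines give 2, and a triangle of three vertices gives -2, so in this counting only the first two would call for subtractions.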

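In general, a counter term has to be introduced for any partition
$\mathscr{V}_1, \dots, \mathscr{V}_k$ of $\mathscr{V}$. When integrated over $D_r$, $r > 0$, the counter terms
$\mathcal{X}^{r,\varepsilon}(\mathscr{V}_1, \dots, \mathscr{V}_k)$ will have in $x$-space product structure:

$$\mathcal{X}^{r,\varepsilon}(\mathscr{V}_1, \dots, \mathscr{V}_k) = \prod_{j=1}^{k} \mathcal{X}^{r,\varepsilon}(\mathscr{V}_j) \prod \Delta^{r,\varepsilon}_l \tag{66}$$

For every line which connects vertices in different $\mathscr{V}_1, \dots, \mathscr{V}_k$, the
contribution $\Delta^{r,\varepsilon}_l$ is the Fourier transform of (62) with the $\alpha$-integration restricted
to $[r, \infty)$. The “generalized vertex parts” $\mathcal{X}^{r,\varepsilon}(\mathscr{V}_j)$ are defined recursively:
$\mathcal{X}^{r,\varepsilon}(\mathscr{V}') = \mathcal{X}_i$ if $\mathscr{V}' = \{v_i\}$ for some $1 \le i \le n$. $\mathcal{X}^{r,\varepsilon}(\mathscr{V}') = 0$ if $G(\mathscr{V}')$ is
one-particle reducible, that is, if $G(\mathscr{V}')$ is disconnected after deleting one of
its lines. In all other cases

$$\mathcal{X}^{r,\varepsilon}(\mathscr{V}') = -M\Big(\sum_{k(P) > 1} \mathcal{X}^{r,\varepsilon}(\mathscr{V}'_{P,1}, \dots, \mathscr{V}'_{P,k(P)})\Big) \tag{67}$$

The sum in (67) extends over all proper partitions $\mathscr{V}'_{P,1}, \dots, \mathscr{V}'_{P,k(P)}$ of
$\mathscr{V}'$. The condition $k(P) > 1$ makes (67) into a recursion, starting with the
original vertex parts $\mathcal{X}_i$, once the operation $M$ is properly chosen. Let
$\mathscr{V}' = \{v_1', \dots, v_s'\}$. By induction, every $\mathcal{X}^{r,\varepsilon}$ on the right-hand side of (67) is in
$p$-space of the form $\mathcal{X}^{r,\varepsilon}(p_1, \dots, p_s) = \delta(p_1 + \cdots + p_s)\,F^{r,\varepsilon}(p)$, where
$F^{r,\varepsilon} \in \mathscr{O}_M$ [9] is holomorphic. Under $M$, every $F^{r,\varepsilon}$ is replaced by its Taylor
series up to the order $\nu(\mathscr{V}')$, inclusively, in $(p_1, \dots, p_s)$ around $(0, \dots, 0)$. The
Taylor coefficients depend on $r$, $\varepsilon$ and diverge for $r \downarrow 0$. This definition of
$\mathcal{X}^{r,\varepsilon}(\mathscr{V}')$ can be extended by a finite renormalization by adding to (67) a
polynomial of degree $\nu(\mathscr{V}')$ with coefficients continuous in $r$ and $\varepsilon \ge 0$. It is clear
that none of the $\mathcal{X}^{r,\varepsilon}(\mathscr{V}_1, \dots, \mathscr{V}_k)$, except for $\mathcal{X}^{r,\varepsilon}(\{v_1\}, \dots, \{v_n\})$, contributes
in (64) for $N$ sufficiently large, since they are all in $p$-space polynomials in
a subset of variables.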
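
The action of the operation $M$ can be illustrated on a drastically simplified stand-in. In the sketch below, which is an assumption-laden toy and not the actual multi-variable construction, a function of a single momentum variable is represented by its list of Taylor coefficients around zero; `taylor_part` plays the role of $M$ and `subtract` the role of $1 - M$. All three function names are hypothetical.

```python
from fractions import Fraction


def taylor_part(coeffs, order):
    """Toy version of M: keep the Taylor terms of degree <= order."""
    return [c if k <= order else 0 for k, c in enumerate(coeffs)]


def subtract(coeffs, order):
    """Toy version of (1 - M): remove the Taylor terms of degree <= order."""
    kept = taylor_part(coeffs, order)
    return [c - m for c, m in zip(coeffs, kept)]


def evaluate(coeffs, p):
    """Evaluate the polynomial with coefficient list `coeffs` at the point p."""
    return sum(c * p ** k for k, c in enumerate(coeffs))
```

With coefficients `[5, 3, 2, 1]` and order 1, the subtracted function is `[0, 0, 2, 1]`: it vanishes at the origin together with its first derivative, which is the sense in which the subtracted piece is a polynomial of degree at most the superficial divergence. A negative order removes nothing.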


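Theorem. Let $V(x)$ be an arbitrary local interaction and $G(\mathscr{V})$ a graph in the
GML series. Define as regularized finite part

$$\mathscr{R}^{r,\varepsilon}(\mathscr{V}) = \sum_P \mathcal{X}^{r,\varepsilon}(\mathscr{V}_{P,1}, \dots, \mathscr{V}_{P,k(P)}) \tag{68}$$

with summation over all partitions $P$ of $\mathscr{V}$. Then in $p$-space (up to $\delta(\sum p_i)$)
$\lim_{r\downarrow 0} \mathscr{R}^{r,\varepsilon}(\mathscr{V})$ is holomorphic and tempered in $p_1, \dots, p_n$, and
$\lim_{\varepsilon\downarrow 0} \lim_{r\downarrow 0} \mathscr{R}^{r,\varepsilon}(\mathscr{V})$ exists in $\mathscr{S}'(R^{4n})$.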

The partial sums of the GML series are sums of Feynman integrals of
the type

$$\lim_{\varepsilon\downarrow 0} \int_C \prod_{j=1}^{L} d\alpha_j\, \frac{R(\alpha, p)}{\big[\sum_{i,j=1}^{n} A_{ij}(\alpha)(p_i, p_j) - \sum_{j=1}^{L} B_j(\alpha)\, m_j^2 + i\varepsilon\big]} \tag{69}$$

Here $C$ is compact, $R(\alpha, p)$ is rational in $\alpha$, a polynomial in $p$ and locally
integrable over $C$, $A_{ij}(\alpha)$ is rational and bounded in $C$ and $B_j(\alpha)$ a polynomial with
$\sum B_j m_j \ge \min\{m_j\}$ in $C$. The partial sums of the GML series are
holomorphic for real $p_1, \dots, p_n$, except for multiple scattering singularities [36].


The proof is combinatorically involved [34, 35], but uses only elemen- 
tary estimates. The requirements (A) and (B) of general quantum field 
theory are satisfied [37, 38] up to distribution-theoretic technicalities. 

The minimality of the Bogoliubov renormalization is best illustrated by
the following interesting result of Speer [39]: In the theory of generalized
functions [40] it is natural to study regularizations of the type

$$\Delta_j^{\lambda_j}(q) = \Gamma(\lambda_j)\, e^{i\pi(\lambda_j - 1)/2}\, P_j(q)\,[(q, q) - m_j^2 + i\varepsilon]^{-\lambda_j} = \frac{1}{i}\, P_j(q) \int_0^\infty d\alpha_j\, \alpha_j^{\lambda_j - 1} \exp i\alpha_j[(q, q) - m_j^2 + i\varepsilon] \tag{70}$$

For Re $\lambda_j$ sufficiently large the convolutions in (60) with these regularized
propagators converge. Speer obtains a distribution-valued analytic function
$F_\varepsilon(\lambda_1, \dots, \lambda_L)$ which is meromorphic in $C^L$ with poles only for $\sum_{\mathscr{M}} \lambda_j = t$ ($\mathscr{M}$
any subset of $\mathscr{L}$, $t$ any integer). Of interest is the behavior for
$(\lambda_1, \dots, \lambda_L) \to (1, \dots, 1)$, where the symmetrized constant term of an iterated Laurent
expansion can be computed by

$$\frac{1}{L!} \sum_P \frac{1}{(2\pi i)^L} \oint_{C_{P(1)}} d\lambda_{P(1)} \cdots \oint_{C_{P(L)}} d\lambda_{P(L)}\, \frac{F_\varepsilon(\lambda_1, \dots, \lambda_L)}{(\lambda_1 - 1) \cdots (\lambda_L - 1)} \tag{71}$$

using suitable contours $C_j$ and summing over all permutations $P$ of $(1, \dots, L)$.


Theorem [39]. $\mathscr{R}(\mathscr{V})$ is, for some definite choice of the finite
renormalizations, equal to the symmetrized constant term in the iterated Laurent
expansion of $F_\varepsilon(\lambda_1, \dots, \lambda_L)$ around $(1, \dots, 1)$.

We finally come to an important distinction. In every order $n$ of the
GML series the counter terms in $\mathscr{R}(\mathscr{V})$ arise formally from counter terms
to $V(x)$ which are Wick polynomials $W_n(x)$ in the free fields $O_{0i}$ and their
derivatives. A theory is called finitely renormalizable if the Wick
powers in $W_n(x)$ stay the same for all orders $n$. Here the minimality of
the renormalization is essential. In a finitely renormalizable theory, the
time-ordered distributions in every order $n$ are uniquely determined by a
fixed, finite number of parameters. Typical in 4-dimensional space-time
are the $A^4$ and the Yukawa interaction, where the parameters are the masses
$m_i$ of the physical particles, the normalization of the matrix elements of
the field operators between vacuum and one-particle states, and the physical
coupling constants (see for example [35]). Exactly such parameters have
defined in the Lee model the renormalized Hamiltonian!

For finitely renormalizable theories a number of different renormaliza- 
tion schemes have been devised: 


1. The Dyson-Salam rules in p-space [41, 42] 

2. The differentiation method [43, 44, 45, 46] 

3. The definition of the currents for local field equations [47, 48, 38, 49] 
4. The Bogoliubov scheme 


Although the combinatorial structure of the subtractions is of extraordinary 
variety in these different approaches, there is good evidence that they all lead 
to the same renormalized GML series. 

This “structural stability” of renormalized perturbation theory raises
the question whether there exist exact nontrivial solutions for some infinite
system of equations which couple the time-ordered distributions and which
are suggested by perturbation theory. The advantage of this approach is the
close connection to the scattering amplitudes and the absence of the vacuum
problem; the difficulties come from the known divergence of the GML series
for bosons [50] and the, at best, conditional convergence of $p$-space integrals
in Minkowski space. Presently, this is a field of interesting possibilities with
beautiful applications of nonclassical analysis [51, 52].


REFERENCES 


1. J. Schwinger, Quantum Electrodynamics, Dover, New York, 1958.

2. J. Lascoux, this volume Chapter XIV.

3. R. F. Streater and A. S. Wightman, PCT, Spin & Statistics and All That,
Benjamin, New York, 1964.

4. R. Jost, The General Theory of Quantized Fields, Am. Math. Soc.,
Providence, 1965.

5. R. J. Eden, P. Landshoff, D. I. Olive, and J. Polkinghorne, Analytic
S-Matrix Theory, Cambridge, 1966.

6. J. M. Lévy-Leblond, Comm. Math. Phys. 4, 157 (67).

7. T. D. Lee, Phys. Rev. 95, 1329 (54).

8. R. Schrader, Thesis, Zürich, 1967.

9. L. Schwartz, Théorie des distributions, Hermann, Paris, 1957/59.

10. V. Bargmann, Ann. Math. 59, 1 (54).

11. L. D. Faddeev, Trudy Steklov Math. Inst. 69 (1963).

12. K. Hepp, to be published.

13. W. Pauli, Rev. Mod. Phys. 13, 203 (41).

14. L. Gårding and A. S. Wightman, Ark. Fysik 28, 129 (64).

15. H. Epstein, Nuovo Cim. 27, 886 (63).

16. A. Jaffe, Ann. Phys. 32, 127 (65).

17. H. Lehmann, K. Symanzik, and W. Zimmermann, Nuovo Cim. 1, 205
(55); 6, 319 (57).

18. A. Jaffe, Phys. Rev. 158, 1454 (67).

19. R. Haag and D. Kastler, J. Math. Phys. 5, 848 (64).

20. H. Araki, Local Quantum Theory (to be published, Benjamin, New York).

21. I. E. Segal, Proc. N.A.S. 57, 1178 (67).

22. M. Guenin, Comm. Math. Phys. 3, 120 (67).

23. J. Glimm, Comm. Math. Phys. 5, 343 (67); 6, 61 (67).

24. K. O. Friedrichs, Perturbation of Spectra in Hilbert Space, Am. Math.
Soc., Providence, 1965.

25. T. Kato, Perturbation Theory for Linear Operators, Springer, Berlin, 1966.

26. Y. Kato, Prog. Theor. Phys. 26, 99 (61).

27. O. E. Lanford, Thesis, Princeton, 1966.

28. K. Hepp, Comm. Math. Phys. 1, 95 (65).

29. M. Gell-Mann and F. E. Low, Phys. Rev. 84, 340 (51).

30. H. Araki, J. Math. Phys. 1, 492 (60).

31. G. C. Wick, Phys. Rev. 80, 268 (50).

32. E. C. G. Stueckelberg and D. Rivier, Helv. Phys. Acta 23, 215 (50).

33. R. P. Feynman, Phys. Rev. 76, 749, 769 (49).

34. N. N. Bogoliubov and O. S. Parasiuk, Acta Math. 97, 227 (57).

35. K. Hepp, Comm. Math. Phys. 2, 301 (66).

36. F. Pham, this volume Chapter XV.

37. N. N. Bogoliubov and D. V. Shirkov, Introduction to the Theory of
Quantized Fields, Interscience, New York, 1959.

38. W. Zimmermann, Comm. Math. Phys. 6, 161 (67).

39. E. R. Speer, Preprint, Princeton, 1967.

40. I. M. Gelfand and G. E. Shilov, Generalized Functions, Vol. 1, Academic
Press, New York, 1964.

41. F. J. Dyson, Phys. Rev. 75, 486, 1736 (49).

42. A. Salam, Phys. Rev. 82, 217; 84, 426 (51).

43. J. C. Ward, Phys. Rev. 78, 182 (50).

44. R. L. Mills and C. N. Yang, Suppl. Prog. Theor. Phys. 37/38, 507 (66).

45. T. T. Wu, Phys. Rev. 125, 1436 (62).

46. J. P. Eckmann, Diplomarbeit, Zürich, 1967.

47. J. Valatin, Proc. Roy. Soc. A 222, 93, 228; 225, 535; 226, 254 (54).

48. K. Wilson, Cornell report, unpublished.

49. R. Brandt, to be published.


50. C. Hurst, Proc. Cambridge Phil. Soc. 18, 625 (52); W. Thirring, Helv. 
Phys. Acta 26, 33 (53); A. Petermann, Helv. Phys. Acta 26, 291 (53); A. Jaffe, 
Comm. Math. Phys. 1, 127 (65). 

51. K. Symanzik, J. Math. Phys. 7, 510 (66) and in: Mathematical Theory of
Elementary Particles, M.I.T., 1966.

52. J. G. Taylor, J. Math. Phys. 7 (66). 


XIV 


Perturbation Theory in Quantum 
Field Theory and Homology 


JEAN LASCOUX 


Part 1 Normal Form of an Integral From Perturbation Theory
Part 2 Analytical Continuation by Isotopy
    “Little” Isotopy Theorem
    Singularity Symbol
    Introduction to the Lorentz Invariants Instead of the External Momenta
    Unitarity Relation
    Stratification of an Analytic Space V
    Main Isotopy Theorem
    Construction of the Spectral Sequence of Fary
Part 3 Applications
    Picard-Lefschetz Formula
Part 4 Leray Theory of Residues
    Local Division Theorem
    Global Division Theorem
    Griffiths’ Proposition
Part 5 Feynman Integrals Revisited: The α-Representation
    Borel-Narasimhan Theorem
    Symanzik’s Formula
    Differential Forms and Functions with Algebraic Singular Support
Part 6 Work in Progress
    Zeeman Theorem
    Lefschetz Theorems
    Manin’s Theory
    Algebraic De Rham Theorem
References


PART 1: NORMAL FORM OF AN INTEGRAL FROM 
PERTURBATION THEORY 


Perturbation theory in classical analytical mechanics has achieved a 
high degree of perfection: one can even treat stability problems. ... 




But it has been fifty years since Poincaré and Birkhoff. The difficulty 
lies in the global change a trajectory experiences under a small perturbation, 
hence, in the “‘ asymptotics.” 

Quantum field theory (if it exists and I shall assume that you are willing 
to accept that it exists in some sense) is two levels away from the world of 
classical mechanics. 

This twofold difference unfolds as follows: First, there is the quantum 
mechanical wave aspect. Partial differential equations describe the pheno- 
mena by means of propagators or fundamental solutions. In some respects, 
the wave mechanics of the n-body problem and its asymptotics are even better 
understood now than the classical cases. On the other hand, the connection 
with classical mechanics is apparent if one looks at the ‘“‘motion”’ of the 
singularities of the solution of partial differential equations. With the princi- 
pal part of the differential operator, one can build a Hamiltonian for this 
motion. 

The second difference is more drastic and dramatic: the production and 
annihilation of particles as the motion develops. Apparently, it is a radical 
negation of the concept of trajectory. Yet, some connection exists, as is 
well known in optics in the case of a reflecting layer with the three rays:
“incident, refracted, and reflected.” The singularities of the Feynman
integrals shall lead us again to such a picture.

I shall refer to Hepp’s lecture for a discussion of the basic features of 
quantum field theory—its vacuum and its operators. Our approach shall be 
most rough and primitive: we shall even have no need of Hilbert space! 
Further, we deal with only one kind of particle and no spin. 

To a given order $p$ in perturbation, the amplitude for $n$ particles being
created at $x_1 \dots x_n$, propagating with $p$ interactions, and ending up as $m$
particles localized at $z_1 \dots z_m$ is given by the following expression

$$G(z_1, \dots, z_m, x_1, \dots, x_n) = \sum_G A_G$$

$$A_G = \int d^4y_1 \cdots d^4y_p \prod D_F(x_i - y_j) \prod D_F(z_k - y_l) \prod D_F(y_i - y_j)$$

where the different factors of the product are built according to the following:


Rule 1. Draw a picture of the $n$ initial points $x_1 \dots x_n$ and of the $m$
final points $z_1 \dots z_m$, to be called “external vertices”; and of the points
$y_1 \dots y_p$, one for each integration variable, to be called “internal vertices.”

At each internal vertex $y_j$, draw a fixed number of lines. We shall be
definite: let us say three (3), in this case, and the picture looks as shown in
Fig. 1. This is the basic “elementary interaction”; for an algebraic coupling
with no derivatives, interaction in perturbation theory is nothing more than
this rule of assigning to each internal vertex a “bouquet” of lines.




FIGURE 1 


Now, using these lines, join the internal vertices among themselves and
join them, too, to the external vertices but with the restriction that a unique
line starts or ends up at an external vertex. The diagram so obtained is
a Feynman graph. To attach a function to this definite Feynman graph,
apply the following:


Rule 2. For each line of the graph introduce the factor $D_F(x_i - y_j)\dots$
between its end points. At this stage, it remains only to describe this Feynman
propagator $D_F(x)$. In terms of creation and annihilation operators, the
$D_F$ function plays a miraculous role. It orders the noncommuting free-field
operators according to the ordering in time of the points to which the operators
refer:

$$TA(x)A(y) = \begin{cases} A(x)A(y) & \text{if } x_0 > y_0 \\ A(y)A(x) & \text{if } x_0 < y_0 \end{cases}$$

and yet, because the fields are local, that is, they commute for spacelike
distance, $[A(x), A(y)] = 0$ for $(x - y)^2 < 0$, this chronological $T$-product is
Lorentz invariant. $D_F(x - y)$ is simply the vacuum expectation value of this
$T$-product. To place this in a proper perspective, I should discuss “General
Field Theory,” but I shall refer instead to Jost’s basic book and simply write

$$D_F(x) = \frac{1}{(2\pi)^4} \int \frac{e^{-ik\cdot x}}{k^2 - m^2 + i\varepsilon}\, d^4k, \qquad k\cdot x = k_0x_0 - \mathbf{k}\cdot\mathbf{x}$$

where $\varepsilon$ is a small positive number, and we consider the $k_0$ integral first. In the
complex $k_0$ plane, one integrates along the real axis ($-\infty < k_0 < +\infty$). I
shall call this integration path $h_F$. On each side of $h_F$ we have the two poles
of $1/(k^2 - m^2 + i\varepsilon)$, that is,

$$k_0 = \pm\sqrt{\mathbf{k}^2 + m^2 - i\varepsilon} = \pm(\omega - i\eta)$$

where $\eta$ is a small positive number. Hence, we see that if $x_0 > 0$ we can close
$h_F$ at “$\infty$” by a large half-circle along which the integrand is exponentially
small.

Then we have a “cycle” and can deform it into the standard “small
circle around the pole” of the Cauchy residue formula. Along the lower arc
(in Fig. 2) $e^{-ik_0x_0}$ is exponentially small if Im $k_0 < 0$ and $x_0 > 0$. Hence for
$x_0 > 0$, the particle propagates as a positive-frequency solution of the
Klein-Gordon wave equation.


FIGURE 2 (Along this arc, $e^{-ik_0x_0}$ is exponentially small if Im $k_0 < 0$ and $x_0 > 0$.)


Note that if one compactifies the complex plane into the Riemann sphere, the
picture is as shown in Fig. 3, where $S^2$ = Riemann sphere of the $k_0$-complex
plane.
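
The contour argument for the $k_0$ integral can be checked numerically. The sketch below is only an illustration under assumptions chosen for numerical convenience: $\varepsilon$ is taken deliberately large (0.5) so that the poles sit well away from the real axis, the real-axis integral is truncated at a finite cutoff, and both function names are hypothetical. It compares a direct trapezoidal integration of $\int dk_0\, e^{-ik_0t}/(k_0^2 - \omega^2 + i\varepsilon)$ with the value obtained by closing the contour around the pole $k_0 = \omega - i\eta$ (for $t > 0$) or $k_0 = -(\omega - i\eta)$ (for $t < 0$).

```python
import cmath
import math


def k0_integral_numeric(t, omega, eps, cutoff=200.0, step=0.01):
    """Trapezoidal estimate of the k0 integral along the (truncated) real axis."""
    n = int(round(2 * cutoff / step))
    total = 0j
    for i in range(n + 1):
        k0 = -cutoff + i * step
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * cmath.exp(-1j * k0 * t) / (k0 * k0 - omega * omega + 1j * eps)
    return total * step


def k0_integral_residue(t, omega, eps):
    """Value from the residue theorem.

    The principal square root gives the pole omega - i*eta in the lower
    half-plane; closing below for t > 0 (or above for t < 0) yields
    -i*pi*exp(-i*pole*|t|)/pole.
    """
    pole = cmath.sqrt(omega * omega - 1j * eps)
    return -1j * math.pi * cmath.exp(-1j * pole * abs(t)) / pole
```

The two evaluations agree up to the small error caused by truncating the real axis, and the result depends on $t$ only through $e^{-i(\omega - i\eta)|t|}$, which is the positive-frequency behavior described above.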

$D_F(x)$ behaves remarkably in $x$-space for large values of $|x|$: in the
timelike direction, it decreases slowly as $\sim (1/|x^2|^{3/4})\,e^{-im\sqrt{x^2}}$ (note the
oscillating factor $e^{-im\sqrt{x^2}}$), and in the spacelike direction we have an
exponential decrease $\sim (1/|x^2|^{3/4})\,e^{-m\sqrt{|x^2|}}$. As this behavior
is of fundamental value for the modern proofs of the existence of asymptotic
states, we shall collect this information in Fig. 4.




FIGURE 3 


FIGURE 4 


To describe the behavior of $D_F(x)$ in the neighborhood of the light cone,
I shall borrow from Bogoliubov-Shirkov the following formula, which
exhibits the type of singularities the Feynman function has in $x$-space


1 1 wm? me) 


be im? 
=e —— — ~ — &(x*) +— 5 | 
ome) An ot 4n?ix? 167 re 877 n( 2 


+ O(,/|x?| In [x?])... 


where $\delta$ and $\theta$ are the usual distributions, but here the support of the singularity
is the algebraic light cone

$$x^2 = x_0^2 - \mathbf{x}^2 = 0$$

which is itself singular at the origin (see Gelfand-Shilov, Theory of
Distributions, Vol. 1).

It is about the most complete table of singularities one can think of, and
we shall retreat rapidly to working exclusively in momentum space, where the
Fourier transform is so simple:

$$\frac{1}{k^2 - m^2 + i\varepsilon}$$

Hence, take the Fourier transform of $G(z_1 \dots z_m, x_1 \dots x_n)$ and substitute
for each propagator

$$D_F(\mu_i - \mu_j) \quad \text{where } \mu = x,\ y, \text{ or } z$$

the Fourier transform

$$\int_{h_F} \frac{e^{-ik_{ij}\cdot(\mu_i - \mu_j)}}{k_{ij}^2 - m^2 + i\varepsilon}\, d^4k_{ij}$$

The end result is

$$G(q_1 \dots q_m, p_1 \dots p_n) = \frac{1}{\prod (p_i^2 - m^2 + i\varepsilon) \prod (q_j^2 - m^2 + i\varepsilon)} \times \int_{h_F} \prod \frac{d^4k_{ij}}{k_{ij}^2 - m^2 + i\varepsilon} \prod_i \delta\Big(\sum_j \varepsilon_{ij}k_{ij} + \sum_j \varepsilon_{ij}p_j + \sum_j \varepsilon_{ij}q_j\Big)$$

where the variables $p$ are Fourier conjugates of $x$, the $q$ are Fourier conjugates
of $z$, and the $\varepsilon_{ij}$ are the coefficients of the incidence matrix of the graph $G$.
The $k_{ij}$ are four-momenta associated with each internal line $(y_i, y_j)$; $h_F$ is the
chain of integration obtained by applying Feynman’s boundary conditions
$(m^2 - i\varepsilon)$ in order to avoid the set of real points where the integrand becomes
infinite. We already met this situation when we studied $D_F(x)$. We now
prefer to express it as a displacement of the chain of integration $h_F$
instead of a displacement of the polar manifolds.


The external vertices are suppressed from the picture: the external lines
$(x_i, y_j)$ or $(y_j, z_k)$ are continued to $\infty$ and called ingoing and outgoing
particles, respectively.

The explicit factor

$$\frac{1}{\prod (p_i^2 - m^2 + i\varepsilon) \prod (q_j^2 - m^2 + i\varepsilon)}$$

is a simple rational function of the external momenta $(p, q)$. It really belongs
to the “asymptotics” of perturbation theory. I shall refer to Hepp’s
description of the asymptotic states and, on the pretext of its simple analytic
structure, leave it out!

Hence, the final form of the integral attached to a graph $G$ shall be

$$F_G(p) = \int_{h_F} \frac{d^4k_1 \wedge d^4k_2 \wedge \cdots \wedge d^4k_r}{\prod_{i=1}^{N} \Big(\big(\sum_j \eta_{ij}k_j + \sum_j \eta'_{ij}p_j\big)^2 - m^2\Big)}$$

where $r$ is the rank of $G$, the number of independent cycles of the graph $G$, and
$(k_1 \dots k_r)$ is a basic set of integration variables after we have taken into
account all the conservation laws: $\delta(\cdot)$ at each vertex. Further, $i$ is now a
label for an internal line, $i = 1 \dots N$, and

$$\sum_j \eta_{ij}k_j + \sum_j \eta'_{ij}p_j$$

is simply the four-momentum associated with the line $i$ and expressed in
terms of the integration variables $k$ and the external momenta $p$, where we
have suppressed any distinction between the “outgoing” and “incoming”
particles.
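
The rank $r$, the number of independent cycles, can be computed directly from the list of internal lines. The sketch below uses the standard graph-theoretic count (lines minus vertices plus connected components); the function name and the representation of lines as pairs of vertex indices are assumptions made here for illustration.

```python
def independent_cycles(n_vertices, lines):
    """Number of independent cycles of a graph given as a list of vertex pairs.

    Uses r = (number of lines) - (number of vertices) + (connected components),
    with the components found by a small union-find.
    """
    parent = list(range(n_vertices))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    components = n_vertices
    for a, b in lines:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            components -= 1
    return len(lines) - n_vertices + components
```

A triangle of three internal vertices has one independent cycle, so the integral runs over a single four-momentum; two vertices joined by three lines have two, giving an eight-dimensional integration.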

Assume further that the integral is convergent.* Now, let us turn to an 
analysis of its mathematical structure. 

1. The integrand is a rational differential form with respect to the
variables $k$. Outside its polar locus, the form is closed since it is analytic and of
maximal rank in $C^{4r}$.

2. The polar locus is the union of quadrics

$$\Big(\sum_j \eta_{ij}k_j + \sum_j \eta'_{ij}p_j\Big)^2 - m^2 = 0$$

which we shall denote by $S_i$. These quadrics in $C^4$, with signature
$(+, -, -, -)$, depend algebraically on the external parameters $p$.

3. Compactify $C^{4r}$ into the complex projective space $P^{4r}$ by adjoining
the hyperplane “at infinity” $\Pi_\infty$. Denote by $\bar S_i$ the projective quadrics.

* Weinberg, Phys. Rev. 118, 838 (1960). 


Let $\Sigma$ be the real algebraic set where the $\bar S_i$ intersect $\Pi_\infty$. Then $h_F$ is
“almost” a cycle in $P^{4r} - \bigcup_{i=1}^{N} \bar S_i$, if we could move it away from the “bad”
set $\Sigma$ and close it again.

As a very rough first approximation the following mathematical
idealization is of interest: Normal form of an integral (see Fotiadi, Froissart,
Lascoux, Pham: Applications of an Isotopy Theorem, Topology, Vol. 4,
p. 159).

Consider an algebraic family $\mathscr{S}$ of hypersurfaces in $P^n \times T$, where $T$ is
an affine space; that is, introducing homogeneous coordinates $(x_0, x_1, \dots, x_n)$
for $P^n$ and coordinates $(t_1, t_2, \dots, t_q)$ for $T$, the algebraic family $\mathscr{S}$ is globally
given by the product

$$S_1(x, t)\,S_2(x, t) \cdots S_N(x, t) = 0$$

where the $S_i$ are irreducible polynomials in $x$ and $t$, homogeneous with
respect to $x$.

Assume that the family is such that for all $t$, $\operatorname{Cod}_{P^n} \mathscr{S}_t = 1$. Hence
for any value of $t$, we have a nonempty union of algebraic hypersurfaces in
$P^n$, where

$$\mathscr{S}_t = (P^n \times \{t\}) \cap \mathscr{S} = \bigcup_{i=1}^{N} S_i(t)$$

Assume that for a generic value of $t$, say, $t = t_0$, the hypersurfaces $S_i(t)$
are in general position in $P^n$, that is, their normals are linearly independent on
their intersections:

$$d_xS_{i_1} \wedge d_xS_{i_2} \wedge \cdots \wedge d_xS_{i_k} \ne 0 \quad \text{on} \quad S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_k}$$

for $t = t_0$.
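
The general-position condition says that, at a common point of several hypersurfaces, their gradients are linearly independent. A minimal sketch of that test follows; it assumes the gradients have already been computed by hand in an affine chart and are passed in as rows of numbers, and the names `rank` and `in_general_position` are hypothetical.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix given as a list of rows, by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    n_cols = len(m[0]) if m else 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def in_general_position(gradients):
    """True if the given gradient vectors are linearly independent."""
    return rank(gradients) == len(gradients)
```

In the chart with coordinates $(x_1, x_2)$, the curves $x_1 = 0$ and $x_2 = 0$ meet at the origin with gradients $(1, 0)$ and $(0, 1)$, so they are in general position there. The curves $x_1 = 0$ and $x_1 - x_2^2 = 0$ are tangent at the origin, where both gradients equal $(1, 0)$, and the test fails.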


Assume that for $t = t_0$, we have a compact cycle in $P^n$ which does not
intersect the $S_i$:

$$h_{t_0} \in H_n^c(P^n - \mathscr{S}_{t_0}), \qquad \mathscr{S}_{t_0} = \bigcup_{i=1}^{N} S_i(t_0)$$

where $H_n^c(P^n - \mathscr{S}_{t_0})$ is the $n$-dimensional compact homology group and $h_{t_0}$
denotes both the compact cycle of integration and its homology class.
Then consider the analytic function defined by

$$F(t) = \int_{h_{t_0}} \frac{\omega}{\prod S_i(x, t)}$$

where $\omega/\prod S_i(x, t)$ is an analytic differential form in the $x$ variables closed
outside its polar locus. We shall call this “the normal form of the integral”
and develop in the following the rather smooth mathematical treatment one
can give to our idealized model. But we shall have to remember that a large
part of the physical content of Feynman integrals was cut off with our neglect
of the bad set $\Sigma$ and of the real algebraic character of the integral (we have
carried out a too easy complexification and disregarded the fact that our
integral was first given as an integral over the reals of a differential form where
the complex numbers enter only with Feynman’s prescription: $m^2 - i\varepsilon$).

PART 2: ANALYTICAL CONTINUATION BY ISOTOPY 


I shall refer to the pair (Y x T, S) as “the family.” The ‘“‘theorem”’ 
I want to talk about is obvious (I was told so) for the main application we 
made use of. Yet, to understand it, we employ such heavy machinery that it 
is a pity the question should stay in such state. The details of the machinery 
were all borrowed from what I shall call Thom’s main isotopy theorem. Then, 
the proper aim of my lecture should be to end by stating this theorem, and 
taken as a whole, it is only a biased introduction to this circle of questions. 

Let me introduce a convention: Under the heading “ Recall’ I shall list 
notions, mainly topological, into the exact definition of which I do not wish to 
enter. 


To begin: 
Recall that a fiber space $(E, T)$ is locally a product

$$\begin{array}{ccc} E & \xleftarrow{\ g\ } & X \times U \\ \downarrow & & \downarrow p_2 \\ T & \supset & U \end{array}$$

with $p_2$ the projection on the second factor. The local isomorphism will
always be a homeomorphism, the only exception being a particular case of
mixed structure that we need:

| eR 


(A 


where $E$, $T$, and $X$ are $C^\infty$-manifolds, $g(x, t)$ being differentiable with respect to
$t$ and satisfying the condition that for any vector field $Z$, $C^\infty$ on $T$, hence on
$X \times T$, $g_*Z$ exists and is a locally Lipschitzian vector field on $E$.

Definition. A vector field $V$ on $E$ is Lipschitzian if

$$|V(y) - V(y')| \le C\,|y - y'|$$

for $y, y' \in E$. In ordinary words, $g(x, t)$ is differentiable with respect to $t$ and a
little more than continuous with respect to $x$, as could be expected from
homeomorphisms generated by integrating along vector fields.


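Recall:

\[
\mathcal{S} \subset E \longrightarrow T
\]

is a fiber pair with base space T (see Spanier, Algebraic Topology, p. 265), if
locally

\[
\begin{array}{ccc}
(E, \mathcal{S}) & \longleftarrow & (X \times T,\; S \times T) \\
 & \searrow \quad \swarrow {\scriptstyle p_2} & \\
 & T &
\end{array}
\]

is a commutative diagram. In ordinary parlance, there exists a common
local trivialization of the two fiber spaces 𝒮 ⊂ E.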

Then our main application shall be, referring to the conditions I gave in
the definition of normal form, “there exists an algebraic set in T, called L (for
Landau), such that (E, 𝒮)_{T−L} is a fiber pair above T − L.” Our main goal will
be to describe the (E, 𝒮) above L by further extraction from it of such fiber
pairs: (E, 𝒮)_{L−L_1}, where L_1 is an algebraic set in L, (E, 𝒮)_{L_1−L_2}, ....


Recall. An isotopy map g : (X, S_0) → (X, S_1) is a homeomorphism such that
there exists a continuous family of homeomorphisms

\[
G = \{ g_t \}_{t \in I} : X \times I \longrightarrow X
\]

such that if we put 𝒮 = trace of S_0 under G, then (X × I, 𝒮)_I defines a fiber
pair with the restrictions g = g_1 and Id_X = g_0.

Why is this concept relevant to us? Let h be a cycle representative of a
homology class in H_p(X − S_0). When we move the parameter continuously,
we want to define, when it exists, a corresponding continuous motion of h;
that is, we want to make sure that at the homology level

\[
H_p(X - S_0) \xrightarrow{\; g_* \;} H_p(X - S_1)
\]

h survives, that is, g_* h ≠ 0. There are two ways to attack this:

1. Subdivide h into (small) simplexes, move them, keeping them away
from the boundary of X − S_t, then fix them again to obtain a cycle
representing g_*h. This is the method of Nilsson and of Leray’s second manner. It is
also the physicist’s way, before we start reading Leray and Thom.

2. Move the whole work: g being an isotopy, g_* is thus an isomorphism,
hence taking care of all the cycles simultaneously. The “whole work”
denotes in an imprecise manner as many interesting topological invariants as
you can put your hands on. For instance, it could include the singular
complex (S(X), S(S)), the requirement that g_* or g^* be an isomorphism, the
immersion S ↪ X and its invariants.... We shall choose this second
approach.

How to construct an isotopy? The answer is clear from Professor 
Bott’s lecture—through vector fields. But we want to generate by integration 
along this vector field a group of homeomorphisms. 

The maximal generality we shall require (although unnecessarily 
restricted according to Thom) is described as follows: 


Lemma 1. (from Milnor’s “Morse Theory”) Let F be a vector field on
a C^∞-manifold X. Assume Supp F compact and F ∈ Lip. Then F generates
a one-parameter group of homeomorphisms γ(t).


It is a well-known lemma in ordinary differential equations

\[
\frac{d\gamma(t, y)}{dt} = F(\gamma(t, y))
\]

Lip ensures the existence of a unique local solution and the Lip-dependence
on the initial conditions.

Compact: You can do away with patching a finite number of neighbor- 
hoods without losing too much. 
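
The content of Lemma 1 can be checked numerically on a toy case. The sketch below is only an illustration under stated assumptions: the field F(y) = max(0, 1 − y²) on the real line (Lipschitz, with compact support [−1, 1]) and the function names `F` and `flow` are hypothetical choices, not taken from the lecture, and the flow is approximated by a classical Runge-Kutta scheme rather than computed exactly.

```python
import math


def F(y):
    """Hypothetical Lipschitz vector field on the real line, supported on [-1, 1]."""
    return max(0.0, 1.0 - y * y)


def flow(t, y, steps=2000):
    """Approximate gamma(t)(y), the time-t flow of dy/dt = F(y), by classical RK4."""
    h = t / steps
    for _ in range(steps):
        k1 = F(y)
        k2 = F(y + 0.5 * h * k1)
        k3 = F(y + 0.5 * h * k2)
        k4 = F(y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return y


if __name__ == "__main__":
    y0 = 0.1
    # Group property: gamma(s + t) = gamma(s) o gamma(t), up to integration error.
    print(flow(0.7, y0), flow(0.4, flow(0.3, y0)))
    # Inside (-1, 1) the exact solution is tanh(t + atanh(y0)).
    print(math.tanh(0.7 + math.atanh(y0)))
    # Points outside the support of F do not move.
    print(flow(1.0, 2.0))
```

Running the flow backward (negative t) undoes it, which is the numerical shadow of γ(t) being a homeomorphism with inverse γ(−t).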

What requirements ought our Lip-vector fields to meet? The way I think
best is to look at the simplest situation, where 𝒮 is irreducible. Hence, look
at one manifold, S, embedded in the ambient space X.


Recall. Tangent bundle of S: It is the vector bundle of all vectors tangent to
S, denoted T(S).

Normal bundle: Choose a Riemannian metric at least in the neighborhood
of S. Take the normal vectors to S: N(S) is again a vector bundle

\[
N(S) + T(S) = T(X)
\]

Tubular neighborhood of S: It is the union 𝒯(S) of the geodesic normal
p-balls, if S is of codimension p.

Repeat: How to move S? Take a global section of N(S), and integrate
the corresponding vector field. Note that global sections of T(S) generate
homeomorphisms of S into S. They are very interesting although they don’t
“move S” in X. If you can find better and better vector fields, such as
k-differentiable, C^∞, holomorphic, when it makes sense, then your notion shall
be accordingly k-differentiable, C^∞, holomorphic.

Note now that the R-metric is unessential. The important concept we
are after is transversality.


Recall. Two manifolds V, W are said to be transversal to each other if along
V ∩ W, T(V) + T(W) = T(X) (or in general position if we deal with
hypersurfaces: linear independence of the normals). Denote this transversality by
V ⋔ W. Returning to our family affair, let us make all these notions of
tangent bundle and transversality relative to the map π : X × T → T. Denote
the topological product space X × T by E.
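
For hypersurfaces, the transversality criterion above (linear independence of the normals) is easy to test pointwise. The following is a minimal sketch under assumptions of my own choosing: the ambient space is real 3-space, the two hypersurfaces are the unit sphere and a horizontal plane z = const, and all function names are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors; it vanishes iff a and b are linearly dependent."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def sphere_normal(p):
    """Gradient of x^2 + y^2 + z^2 - 1 at the point p."""
    x, y, z = p
    return (2.0 * x, 2.0 * y, 2.0 * z)


def plane_normal(p):
    """Gradient of z - c (independent of the constant c and of the point)."""
    return (0.0, 0.0, 1.0)


def transversal_at(p, tol=1e-12):
    """True if the normals of the two hypersurfaces are independent at p."""
    n = cross(sphere_normal(p), plane_normal(p))
    return any(abs(c) > tol for c in n)


if __name__ == "__main__":
    print(transversal_at((0.6, 0.0, 0.8)))  # sphere meets the plane z = 0.8 transversally
    print(transversal_at((0.0, 0.0, 1.0)))  # the plane z = 1 is tangent at the north pole
```

The second point is exactly the kind of parameter value (here c = 1) that the Landau set is meant to collect.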


Recall. In the tangent space TE_{(x,t)} there are two kinds of vectors: the
vectors tangent along the fiber X, called vertical, with their linear
space denoted by V_{(x,t)} ⊃ T(S_t)_x, and the remaining vectors grouped into
equivalence classes

\[
X_1 \sim X_2 \quad \text{iff} \quad \pi_* X_1 = \pi_* X_2
\]

Call H_{(x,t)} the corresponding vector space of equivalence classes. It is a
good concept to define a motion compatible with the map π, that is, sending
fiber into fiber.

Note that for X × T, everything being a manifold, we have trivially

\[
0 \to V \to TE \to H \to 0 \qquad \text{(exact sequence)}
\]

At each point, the sequence splits. If we want to, we can make the splitting σ
global along the fiber (X), because our family is trivial.

Similarly, one can define the horizontal vectors H𝒮_{(x,t)} of the tangent
vector space T𝒮_{(x,t)}.


Recall. σ is a connection if one can choose in TE a horizontal subspace
continuously along the fiber. Let us return to a fiber of our family but now take
the next stage of complication: 𝒮 = 𝒮_1 ∪ 𝒮_2 (but 𝒮_1 ⋔ 𝒮_2), S = S_1 ∪ S_2
(but S_1 ⋔ S_2). We can seek to factorize the simultaneous motion of the two
manifolds into steps where only one manifold is displaced, the other being
restricted to be invariant. To construct such a restricted motion, the “germ”
is again in the almost infinitesimal tubular neighborhoods of S_1 and S_2.
Choose the R-metric in such a way that S_1 is normal to S_2. To move S_1 into
S_1' we look at the normal disk N_x at x ∈ S_1. Assume that S_1' meets the
disk at a unique point x' (Fig. 5). The map x ↦ x' is given by the implicit
function theorem, which can be applied by virtue of the transversality.

In the disk, the picture is as follows: We have an R-metric, hence
geodesics. Let (x, x') be the unique geodesic joining x to x'. Extend the
geodesic field along (x, x') into a continuous vector field which vanishes near
the boundary of the disk (Fig. 6). Such a vector field is tangent to S_2 in a
neighborhood of S_1 ∩ S_2.

By integrating such a vector field, we obtain a homeomorphism of
𝒯(S_1) onto some 𝒯(S_1'). We still have the fact that S_1' ⋔ S_2. Repeat the
same procedure to move S_2 into S_2'. Note that S_1 ∩ S_2 → S_1' ∩ S_2 →
S_1' ∩ S_2'. The motion was therefore “in” the normal bundle of the
intersection, which is of complex dimension 2. Let X_1 be the first vector field
(⊥ S_1) and let X_2 be the second vector field (⊥ S_2). The principal use of the
tubular neighborhood of the intersection S_1 ∩ S_2 is to show the place where
some plumbing has to be done: look at the boundaries of the tubular
neighborhoods and at their intersections. Note that ∂𝒯(S_1) ⋔ ∂𝒯(S_2).


FIGURE 5


Define. A plumbing function g (Thom prefers to refer to g as the carpeting
function, so we shall use this name) is a C^∞ function on X − (S_1 ∪ S_2),
positive on X − (S_1 ∪ S_2), and such that g has a C^0 extension to the closure
of X − (S_1 ∪ S_2) satisfying g^{-1}(0) = S_1 ∪ S_2. Require also that, near the
boundary of the open manifold X − (S_1 ∪ S_2) (this open manifold is called a
stratum), the differential of g does not vanish:

\[
dg \neq 0
\]


FIGURE 6


Define. k_{S_1}, the retraction of 𝒯(S_1) over S_1, obtained by shrinking the geodesic
normal disks of radius ε. We shall use it for Leray’s definition of the cobord
map, that is, to lift cycles from S_1 to the ambient open tubular neighborhood
𝒯(S_1) − S_1.


Define. k_{(S_1 − S_2, S_2)} as the corresponding retraction obtained by letting the
radius ε tend to 0 as we come close to the submanifold S_1 ∩ S_2 of S_1. Now
repeat the construction to define k_{(S_2 − S_1, S_1)}.

Generalize: the corresponding structure when the S,(x, t) are transversal 
is easy to visualize. We have a set of open manifolds V,, such that 


This allows us to define an incidence relation: V_α > V_β if V_β is in the
boundary of V_α. On each V_α there exists a carpeting function g_α and a retraction k_α;
the kind of differential C–W complex so defined has for attaching maps the
k_α retractions, which must satisfy a condition of compatibility:

\[
k_{V_\alpha}\big|_{V_\gamma} = k_{V_\alpha}\big|_{V_\beta} \circ k_{V_\beta}\big|_{V_\gamma}
\qquad \text{if } V_\alpha < V_\beta < V_\gamma
\]


The following lemma is then relevant to the problem of constructing the 
composite of two maps if there is a further restriction to meet along the 
motions, such as leaving a closed subset invariant. 


Lemma 2 (Chevalley). Let Σ ⊂ X be a closed set, X_1, X_2 ∈ Lip such that
the corresponding groups of homeomorphisms γ_1, γ_2 leave Σ invariant.
Consider X = X_1 + X_2; then X generates a group of homeomorphisms γ such
that γΣ ⊂ Σ.


The usefulness of this lemma is that we can now localize our construc- 
tion of vector fields and using partitions of unity patch the local vector fields 
in the intersection of the small neighborhoods where they are defined (Fig. 7). 


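HINT. Professor Helgason did the main algebraic job if you define

\[
\gamma(t) = \lim_{n \to \infty} \left( \gamma_1\!\left(\tfrac{t}{n}\right) \gamma_2\!\left(\tfrac{t}{n}\right) \right)^{n}
\]

Use the algebraic identity

\[
\alpha^{n} - \beta^{n} = \sum_{k=0}^{n-1} \alpha^{k} (\alpha - \beta)\, \beta^{\,n-1-k}
\]

Let S_i(x, t) = 0 be the global equation of the hypersurface S_i. Consider
a neighborhood U of (x, t) such that

\[
U \cap S_i \neq \emptyset \quad \text{for } i \in I \subset [1 \cdots n]
\]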




FIGURE 7 


Then if x = (x_1, ..., x_r) and t = (t_1, ..., t_q) denote local coordinates, what we
are looking for is a vector field with vertical components X = (X_1(x, t), ...,
X_r(x, t)) and horizontal components T = (T_1(t), ..., T_q(t)), such that the
vector field generates a group of homeomorphisms leaving 𝒮 invariant and
sending fiber into fiber. The latter is already taken care of by choosing T
independent of x. As for 𝒮 being left invariant, choose the vector field in U such
that

\[
dS_i = \sum_{j} \frac{\partial S_i}{\partial x_j}\, dx_j + \sum_{k} \frac{\partial S_i}{\partial t_k}\, dt_k = 0, \qquad i \in I
\]

for

\[
(dx, dt) = (X, T)\, d\tau
\]

over the locus (S_i = 0)_{i∈I}, and similarly for the other partial
intersections: dS_i = 0 for i ∈ I' ⊂ I, over (S_i = 0)_{i∈I'}. Note that we are
considering the natural decomposition of U into strata by the (S_i) and
constructing a vector field tangent to these strata. Finally, in the stratum
U − ∪_{i∈I} S_i, there is no condition on the vector field except that the different
choices should match as smoothly as possible, since we want to solve the
differential equations of motion

\[
\frac{dx}{d\tau} = X(x, t), \qquad \frac{dt}{d\tau} = T(t)
\]

Assume that a neighborhood of each point along a fiber (X × t_0) exists
such that we can find our Lip vector field leaving 𝒮 invariant. This means
that for (x_0, t_0) ∈ X × t_0, there exists U such that (X × T|_U, 𝒮|_U) is a fiber
pair with map p_2, the projection over the second factor.


Lemma 3. If π : X × T → T is proper, then the two conditions are equivalent:

1. (X × T, 𝒮) is a fiber pair.

2. For any (x, t), there exists a neighborhood U such that (X × T|_U,
𝒮|_U) is a fiber pair.


The term “π proper” implies X compact, which implies in turn that one can
cover a neighborhood in X × T of a fiber with a finite number of
neighborhoods U_i; hence ∩ π(U_i) is a neighborhood of t_0 in T.


Then use partitions of unity and Lemma 2 to conclude the proof of
Lemma 3.


“Little” Isotopy Theorem


If, for t = t_0, the hypersurfaces {S_i(t_0)}_{i∈[1···n]} intersect transversally,
then there exists an algebraic set L of codimension 1 such that

\[
\left( X \times T\big|_{T-L},\; \mathcal{S}\big|_{T-L} \right)
\]

is a fiber pair.
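The linear system over S_I = ∩_{i∈I} S_i

\[
\sum_{j} X_j \frac{\partial S_i}{\partial x_j} + \sum_{k} T_k \frac{\partial S_i}{\partial t_k} = 0, \qquad i \in I
\]

has a continuous solution where one can choose T equal to a unit vector if

\[
\operatorname{rk} \left[ \frac{\partial S_i}{\partial x_j} \right] = |I| \quad \text{over } S_I
\]

One can then solve by Cramer’s rule the linear system with an “inhomogeneous” term:

\[
\sum_{j} \frac{\partial S_i}{\partial x_j}\, X_j = - \sum_{k} \frac{\partial S_i}{\partial t_k}\, T_k \tag{A}
\]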

and this describes precisely the lifting of the vector field T from T into X that
we are seeking for our special type of fiber pair. Note that these determinants
call for an interpretation in terms of Grassmann coordinates. We have the
following features to determine X from our linear system of equations (Fig. 8):

1. An inhomogeneous term.

2. A different number of equations according to the number of S_i which go
through the neighborhood in X which we consider.

3. The equations depend on parameters t ∈ T. For |I| = r, we have
r equations for the r unknowns (X_j(x, t)); we have no freedom of choice: move
along the solution x_j = x_j(t) given by the implicit function theorem.


FIGURE 8


For the general case, it is easy to guess what arbitrariness we have to
face: the kernel of [∂S_i/∂x_j] is given by the vectors tangent to S_I; hence they
generate homeomorphisms of S_I.
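
The case |I| = r, where Cramer’s rule leaves no freedom, can be made concrete. The sketch below is an illustration under assumptions that are not in the lecture: X is the real plane (r = 2), T is a real line with unit horizontal field T = 1, and the two hypersurfaces are S_1 = x_1² + x_2² − t and S_2 = x_1 − x_2. All function names are hypothetical. The lifted field is obtained from the inhomogeneous system by Cramer’s rule and then integrated by a Runge-Kutta scheme, so that the intersection point is transported from the fiber over t = 2 to the fiber over t = 8.

```python
def jacobian_x(x, t):
    """Matrix [dS_i/dx_j] for S_1 = x1^2 + x2^2 - t and S_2 = x1 - x2."""
    x1, x2 = x
    return [[2.0 * x1, 2.0 * x2], [1.0, -1.0]]


def dS_dt(x, t):
    """Column [dS_i/dt] for the same two hypersurfaces."""
    return [-1.0, 0.0]


def lifted_field(x, t, T=1.0):
    """Solve sum_j (dS_i/dx_j) X_j = -(dS_i/dt) T for (X_1, X_2) by Cramer's rule."""
    (a, b), (c, d) = jacobian_x(x, t)
    r1, r2 = (-T * v for v in dS_dt(x, t))
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("rank of [dS_i/dx_j] drops: the point lies over the Landau set")
    return ((r1 * d - b * r2) / det, (a * r2 - r1 * c) / det)


def transport(x, t0, t1, steps=2000):
    """Integrate dx/dt = X(x, t) from t0 to t1 (horizontal field T = 1) by classical RK4."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        k1 = lifted_field(x, t)
        k2 = lifted_field(tuple(xi + 0.5 * h * ki for xi, ki in zip(x, k1)), t + 0.5 * h)
        k3 = lifted_field(tuple(xi + 0.5 * h * ki for xi, ki in zip(x, k2)), t + 0.5 * h)
        k4 = lifted_field(tuple(xi + h * ki for xi, ki in zip(x, k3)), t + h)
        x = tuple(xi + h * (p + 2.0 * q + 2.0 * r + s) / 6.0
                  for xi, p, q, r, s in zip(x, k1, k2, k3, k4))
        t += h
    return x


if __name__ == "__main__":
    # (1, 1) lies on both hypersurfaces at t = 2; at t = 8 the intersection point is (2, 2).
    print(transport((1.0, 1.0), 2.0, 8.0))
```

In this toy family the determinant vanishes only at x_1 + x_2 = 0, that is over t = 0, which plays the role of the Landau set.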

To end the proof of this “little” isotopy theorem, let us study the
algebraic set L ⊂ T. To simplify, I take X to be a projective space. Let L_I denote
the subset of T such that

\[
\operatorname{rk} \left[ \frac{\partial S_i}{\partial x_j} \right]_{i \in I} < |I|
\]

Consider the ideal (called a Jacobian extension)

\[
\mathcal{H}_I = \left( S_i,\ i \in I;\ \det\nolimits_I \left[ \frac{\partial S_i}{\partial x_j} \right] \right)
\]

where det_I[∂S_i/∂x_j] are the determinants of all minors with |I| rows and |I|
columns that one can extract from [∂S_i/∂x_j]. Call the set of its zeros (the
associated algebraic variety) L_I. Then elimination theory (or projection parallel
to the fibers) tells us the following:


Lemma 4. Let I = (1, ..., m); consider the system of polynomial equations,
homogeneous with respect to x,

\[
S_1(x, t) = 0, \ \ldots, \ S_m(x, t) = 0
\]

and (n − m) independent equations of the form

\[
\frac{D(S_1, \ldots, S_m)}{D(x_{i_1}, \ldots, x_{i_m})} = 0 \tag{B}
\]

This system has a solution x if and only if the resultant R_I(t) = 0.


For the proof, refer to van der Waerden. For the subfamily

\[
\mathcal{S}_I = \bigcap_{i \in I} \mathcal{S}_i
\]

the subvariety of T associated with the principal ideal (R_I(t)) is of codimension
1 and can be interpreted as the “apparent branch” of the manifold 𝒮_I, and the
projection of its singularities on T. The apparent branch corresponds to the
points where the tangent plane to 𝒮_I becomes too vertical, that is (recall: V is
the vertical tangent subspace to X × T),

\[
\dim V \cap T(\mathcal{S}_I) = \dim X - |I| + 1
\]

Now there is a very important map in the sense of algebraic geometry to
consider: from the obvious remark that if t ∈ L_I, we have R_I(t) = 0, then in
Lemma 4 we can use the arrow from right to left and consider the
corresponding solution x = x_I(t). We shall call this algebraic map ε_I.

If one obtains a point, the singularity of 𝒮_I at (x_I(t), t) is called local.
But it can happen that ε_I(t) is an algebraic set in 𝒮_I|_t. In the simplest case,
this means that the hypersurfaces in X, {S_i(t)}_{i∈I}, are tangent along some
submanifold. If π_I is the projection map restricted to 𝒮_I, then ε_I is a kind of
algebraic section of π_I over L_I, since π_I ∘ ε_I = Id_{L_I}. One should think of
these maps and ideals in a purely algebraic context (Hironaka). They
could be used to describe what generalizations of the quadratic
Picard-Lefschetz formula we should look for. But we shall digress for an immediate
application: the contracted Feynman diagrams as “symbols” of singularities.

Consider the family of quadrics, corresponding to a diagram G, as
algebraic hypersurfaces in C^{4l}.

Let S_i = q_i² − m² = 0 be one of them; then

\[
\frac{\partial S_i}{\partial k_a} = 2\, q_i\, \frac{\partial q_i}{\partial k_a}
\]

is the normal to S_i for fixed external momenta. Recall that

\[
q_i = \sum_{a} \epsilon_{ia}\, k_a + \sum_{e} \epsilon_{ie}\, p_e
\]

Hence, the lack of transversality for the hypersurfaces (S_i)_{i∈I} implies a linear
relation between the normal vectors (in C^{4l}),

\[
\sum_{i \in I} \alpha_i\, \frac{\partial S_i}{\partial k_a} = 0
\]

and at the same time the mass-shell equations

\[
q_i^2 = m^2, \qquad i \in I
\]

It is very convenient not to represent on the diagram G the hypersurfaces
which do not participate in the singularity L_I, by contracting the corresponding
internal lines.

The graph G_I so obtained, with its generalized vertices resulting from
contraction, is now thought to mean:

1. the lines of G_I are momenta on the mass-shell.

2. for any closed loop C of G_I, there exists a linear relation between
4-vectors in C^4

\[
\sum_{i \in C} \alpha_i\, q_i = 0
\]

We shall attach a function to G_I, called the period around L_I, after we have
discovered how the function F_G(t) is ramified when one goes around L_I in the
parameter variety T.


Example 


Let us consider the simple case of the scattering of two particles, in the
fourth order of perturbation. The Feynman diagram is shown in Fig. 9 and
the corresponding amplitude is

\[
F(p_1, p_2, p_3, p_4) = \int \frac{d^4k}{(k^2 - m^2)\,[(k - p_2)^2 - m^2]\,[(k + p_3)^2 - m^2]\,[(k - p_1 - p_2)^2 - m^2]}
\]


FIGURE 9 



FIGURE 10 


Singularity Symbol 


The contraction procedure gives rise by iteration to the following
hierarchy (only part of which is pictured in Fig. 10). This illustrates the
different cases where we have to apply Lemma 4 to build the different
components of the Landau set. Now, there exists a very pictorial way of
describing the type of “algebraic relations” that elimination theory provides us
with. Recall that p_1 + p_2 + p_3 + p_4 = 0. Hence the picture should be at
most three-dimensional since even for the internal momenta q_i, there exists at
least one linear relation among them; using the fact that we are interested in
“algebraic relations,” substitute the Euclidean metric for the Minkowskian
metric, since there is only one nondegenerate quadratic form in C^4 over the
complex numbers. Hence the box singularity L_{1234} corresponds to the packing
of four spheres, shown in Fig. 11. They are centered at the four vertices, with
equal radius m, and meet at the point 0!


FIGURE 11


Introduction of the Lorentz Invariants Instead of the 
External Momenta 


This last picture makes clear that the singularities are Lorentz invariant.
Recall the Hall-Wightman theorem. (I shall refer also to Ehrenpreis’
separation of variables.) Let s = (p_1 + p_2)² and t = (p_1 + p_4)² (in Fig. 11,
they are the two other edges of the tetrahedron). If you put u = (p_1 + p_3)²,
and if the external momenta are also on the mass-shell, that is, p_i² = m²,
where i = 1, 2, 3, 4, then

\[
s + t + u = 4m^2
\]

and, taking as parameters {s, t} ∈ C² = T, one can calculate the Landau set.
It splits into several irreducible components, among which (see Fig. 12):


L_{1234}:  st((s − 4m²)(t − 4m²) − 4m⁴) = 0

L_{123}:  s(s − 3m²) = 0

L_{12}:  s(s − 4m²) = 0


FIGURE 12


For two-particle scattering, the experimental parameters are

\[
s = (p_1 + p_2)^2 \quad \text{(total energy in channel I)}
\]

and

\[
t = (p_1 + p_4)^2 \quad \text{(momentum transfer in channel I)}
\]

While channel I denotes one of the three famed physical regions represented
in Fig. 13, it is also the region Φ_I, where the total “energy”
s = (p_1 + p_2)² = (p_3 + p_4)² is larger than the rest mass squared of the two
ingoing particles as well as of the two outgoing particles. For simplicity we
have taken all the masses to be equal.


FIGURE 13


The forward scattering (cos θ_s = 1) is t = 0; backward scattering
(cos θ_s = −1) is given by t + s = 4m², and the physical region corresponds to
s > 4m² and −1 ≤ cos θ_s ≤ 1.
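
These kinematic statements can be verified on explicit momenta. The sketch below is only a check under assumptions of my own: all four momenta are taken as incoming (so p_1 + p_2 + p_3 + p_4 = 0), the frame is the center-of-mass frame of channel I, the metric signature is (+, −, −, −), and the function names are hypothetical.

```python
import math


def msq(p):
    """Minkowski square of a 4-vector (E, px, py, pz), signature (+, -, -, -)."""
    e, x, y, z = p
    return e * e - x * x - y * y - z * z


def add(p, q):
    """Componentwise sum of two 4-vectors."""
    return tuple(a + b for a, b in zip(p, q))


def momenta(m, k, cos_theta):
    """Equal-mass center-of-mass momenta, all taken as incoming, for scattering angle theta."""
    e = math.sqrt(m * m + k * k)
    sin_theta = math.sqrt(max(0.0, 1.0 - cos_theta * cos_theta))
    p1 = (e, 0.0, 0.0, k)
    p2 = (e, 0.0, 0.0, -k)
    # The physical outgoing momenta are -p4 (scattered from p1) and -p3.
    p4 = (-e, -k * sin_theta, 0.0, -k * cos_theta)
    p3 = (-e, k * sin_theta, 0.0, k * cos_theta)
    return p1, p2, p3, p4


def mandelstam(p1, p2, p3, p4):
    """Return (s, t, u) with s = (p1+p2)^2, t = (p1+p4)^2, u = (p1+p3)^2."""
    return msq(add(p1, p2)), msq(add(p1, p4)), msq(add(p1, p3))


if __name__ == "__main__":
    m, k = 1.0, 1.5
    for c in (1.0, 0.3, -1.0):
        s, t, u = mandelstam(*momenta(m, k, c))
        print(c, s, t, u, s + t + u)
```

With this parametrization t = −2k²(1 − cos θ), so cos θ = 1 gives t = 0 and cos θ = −1 gives t + s = 4m², as stated above.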

I have plotted all the Landau curves. Note that they are real algebraic,
and that some components (s = 0, t = 0) occur with multiplicity. Note also
that for the specialization to the equal masses case, the Landau curve L_{1234}
decomposes. In general (for arbitrary masses), we should expect the Landau
curves to be singular, either from topological considerations, since they are
apparent branches, or from the algebraic remark that the equations R_I(t) = 0 are
discriminants. One applies elimination theory to a set of functions which
are partial derivatives of a unique function; therefore, the functions of this set
are not “generic.”


Example


Consider the cubic X³ + pX + q = 0: its discriminant locus 4p³ + 27q² = 0
has a cusp.
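
The cusp can be exhibited by a rational parametrization of the discriminant curve. The sketch below assumes the parametrization p = −3a², q = 2a³ (my choice of illustration, not taken from the lecture), for which the cubic acquires the double root X = a; the function names are hypothetical. Integer arithmetic keeps every check exact.

```python
def discriminant(p, q):
    """The expression 4p^3 + 27q^2, whose vanishing signals a multiple root of X^3 + pX + q."""
    return 4 * p ** 3 + 27 * q ** 2


def cubic(x, p, q):
    """Value of X^3 + pX + q at X = x."""
    return x ** 3 + p * x + q


def cubic_derivative(x, p):
    """Value of the derivative 3X^2 + p at X = x."""
    return 3 * x ** 2 + p


def discriminant_point(a):
    """Point (p, q) = (-3a^2, 2a^3) of the discriminant curve, with double root X = a."""
    return -3 * a ** 2, 2 * a ** 3


if __name__ == "__main__":
    for a in range(-3, 4):
        p, q = discriminant_point(a)
        print(a, (p, q), discriminant(p, q), cubic(a, p, q), cubic_derivative(a, p))
```

The velocity of the parametrized curve is (−6a, 6a²), which vanishes at a = 0: the curve q² = −4p³/27 has its cusp at p = q = 0, where the three roots coincide.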




To return to the amplitude F_G(s, t), an easy analysis shows that the
branch we are interested in is not singular for Re s < 4m² and Re t < 4m² and
is therefore real analytic in this region. By continuity, one deduces that this
branch is not singular on any of the components of the Landau set which
enter the tube (Re s < 4m², Re t < 4m²).

The only components which are relevant are the “normal thresholds”
at which begin the physical regions Φ_I and Φ_II, that is,

\[
s = 4m^2, \qquad t = 4m^2
\]

The ramification around s = 4m² can also be interpreted as the imaginary part
of F_G(s, t) in the s-plane (Fig. 14). Since F_G(s, t) is real for t real < 0 and
s < 4m², we have, for t < 0 and s > 4m²,

\[
\operatorname{Disc} F_G(s, t) = F_G(s + i\varepsilon, t) - F_G(s - i\varepsilon, t) = 2i \operatorname{Im} F_G(s, t)
\]
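
The step from reality below threshold to the discontinuity formula is presumably the Schwarz reflection principle; a short derivation, written with F_G for the amplitude of the diagram G and with t < 0 held fixed, might run as follows. It is a sketch under the assumption that F_G is analytic in the s-plane cut along s > 4m².

```latex
\begin{align*}
F_G(s,t) \in \mathbb{R} \ \text{for real } s < 4m^2
  &\;\Longrightarrow\; F_G(\bar{s},t) = \overline{F_G(s,t)}
  \quad \text{(Schwarz reflection)} \\
  &\;\Longrightarrow\; F_G(s - i\varepsilon, t) = \overline{F_G(s + i\varepsilon, t)}
  \quad \text{for real } s > 4m^2 \\
  &\;\Longrightarrow\; F_G(s + i\varepsilon, t) - F_G(s - i\varepsilon, t)
     = 2i \operatorname{Im} F_G(s + i\varepsilon, t).
\end{align*}
```

Here Im F_G(s, t) on the cut is understood as the boundary value from the upper half-plane.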


FIGURE 14


From a general principle, the unitarity of the scattering matrix, one can
calculate this imaginary part.


Unitarity Relation

\[
\operatorname{Im} F_G(s, t) = \int d^4k \,
\frac{\delta_+(k^2 - m^2)\, \delta_+\big((p_1 + p_2 - k)^2 - m^2\big)}
     {\big((k - p_2)^2 - m^2 - i\varepsilon\big)\big((k + p_3)^2 - m^2 + i\varepsilon\big)},
\qquad s > 4m^2,\ t < 0
\]

Note that we have only two propagators left and with different prescriptions
for the iε.

The two δ_+-functions mean

\[
\begin{aligned}
k^2 &= m^2, & k_0 &> 0 \\
(p_1 + p_2 - k)^2 &= m^2, & p_{10} + p_{20} - k_0 &> 0
\end{aligned}
\]

Hence the integration domain is really now a two-dimensional sphere S²,
the intersection of the two hyperboloids pictured in Fig. 15. Applying Cauchy’s
theorem gives the following dispersion relation (keeping t real and less than 0)

\[
F_G(s, t) = \frac{1}{\pi} \int_{4m^2}^{\infty} \frac{\operatorname{Im} F_G(s', t)\, ds'}{s' - s}
\]

FIGURE 15


The remarkable feature of these two concepts of unitarity and dispersion
relations is that they hold independently of perturbation theory and can be
derived from basic principles in axiomatic field theory. A third concept
stems from the remark that the function F_G(s, t), being analytic except on the
Landau set, which is of complex codimension 1, can certainly be analytically
continued from the region Φ_I, where this amplitude describes the scattering
of two particles with total energy s, to the region Φ_II. For the determination
of F_G(s, t) which we have chosen, this is very simple.

It is remarkable that the same function F_G(s, t) describes by analytic
continuation to Φ_II precisely the physical process associated with this region:
the scattering of two particles with total energy t. Again, for s < 0, we have
the dispersion relation

\[
F_G(s, t) = \frac{1}{\pi} \int_{4m^2}^{\infty} \frac{\operatorname{Im} F_G(s, t')\, dt'}{t' - t}
\]

with the unitarity relation for Φ_II (s < 0, t > 4m²)

\[
\begin{aligned}
2i \operatorname{Im} F_G(s, t) &= F_G(s, t + i\varepsilon) - F_G(s, t - i\varepsilon) \\
&= \int d^4k \, \frac{\delta_+\big((k - p_2)^2 - m^2\big)\, \delta_+\big((k + p_3)^2 - m^2\big)}
{\big(k^2 - m^2 - i\varepsilon\big)\big((p_1 + p_2 - k)^2 - m^2 + i\varepsilon\big)}
\end{aligned}
\]

Again, this property of “crossing” from Φ_I to Φ_II can be proved for the
two-particle scattering from the axioms (Bros, Epstein, Glaser)! I shall end this
digression by stating a property which is true for F_G(s, t) but is not proved in
perturbation theory nor in axiomatic field theory.


The determination we have chosen for F_G(s, t) is analytic in the product
of the s-plane by the t-plane minus the two cuts

\[
s = 4m^2 + \rho, \quad \rho > 0; \qquad t = 4m^2 + \rho', \quad \rho' > 0
\]

Hence, applying Cauchy’s theorem twice, we have

\[
F_G(s, t) = \frac{1}{\pi^2} \iint \frac{\rho(s', t')\, ds'\, dt'}{(s' - s)(t' - t)} \tag{I}
\]

where ρ(s, t) is a double discontinuity

\[
\begin{aligned}
\rho(s, t) \sim{} & F_G(s + i\varepsilon, t + i\eta) - F_G(s - i\varepsilon, t + i\eta) \\
& - F_G(s + i\varepsilon, t - i\eta) + F_G(s - i\varepsilon, t - i\eta)
\end{aligned}
\]

and Supp ρ is the region above the branch of the Landau curve L_{1234}.
This double-integral representation is known as Mandelstam’s representation.


Exercise. Does it make sense to write the following?

\[
\rho(s, t) = \int d^4k \, \delta_+(k^2 - m^2)\, \delta_+\big((k - p_2)^2 - m^2\big)\,
\delta_+\big((k + p_3)^2 - m^2\big)\, \delta_+\big((p_1 + p_2 - k)^2 - m^2\big)
\]

In the integral formula (I), one of the integrations is elementary, but the last
one is then an elliptic integral.

We have thus finished the list of properties of this fourth-order example. 

In the little isotopy theorem, we proved only that outside L, on the “big”
residual space T − L, the mapping π is a fibering map for the fiber pair
(X × T, 𝒮)_{T−L}. It would be useful to know more: that is, can we break
L = ∪ L_I into pieces U_α (open manifolds) such that on each of these pieces U_α,
π|_{U_α} is the fibering map of the fiber pair (X × T, 𝒮|_{U_α})? To find criteria
in order that there exists an isotopy map between (X, S_t) and (X, S_{t'}) when
S_t and S_{t'} no longer have normal crossings in X is very closely related to the
question of equivalence of singularities.

Thom’s isotopy theorem deals directly with the existence of such 
fiberings. To show the main steps, I shall have to recall a new series of 
notions. 


Recall. If f : X → S is a proper map between two analytic spaces, let df be the
tangent map; then there exist two sequences of analytic subspaces (we shall
call such a sequence a stratification)

\[
X \supset X_1 \supset X_2 \supset \cdots \supset X_\nu, \qquad
S \supset S_1 \supset S_2 \supset \cdots \supset S_\nu
\]

such that

\[
X_\nu = \{\, x \in X \mid \operatorname{rk} df_x \le m - \nu \,\}, \qquad
m = \sup_{x \in X} \operatorname{rk} df_x, \qquad
S_\nu = f(X_\nu)
\]

From this, one deduces that, on the set X_ν − X_{ν+1}, the rank of df is constant.
(This property can also be formulated in the C^∞-case.)
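
A small example may help to fix the indexing of the rank stratification. The sketch below assumes a map chosen only for illustration, f(x, y) = (x², y²) from the plane to the plane, whose tangent map has maximal rank m = 2; the function names are hypothetical. The rank is computed by exact Gaussian elimination over the rationals.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix with rational entries, by exact Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def jacobian(x, y):
    """Tangent map df of the illustrative map f(x, y) = (x^2, y^2)."""
    return [[2 * x, 0], [0, 2 * y]]


M = 2  # maximal rank of df for this map


def depth(x, y):
    """Largest nu such that (x, y) lies in X_nu = {rk df <= M - nu}."""
    return M - rank(jacobian(x, y))


if __name__ == "__main__":
    for point in [(1, 1), (0, 3), (5, 0), (0, 0)]:
        print(point, depth(*point))
```

Here X_1 is the union of the two coordinate axes and X_2 is the origin; on each difference X_ν − X_{ν+1} the rank of df is constant, as stated above.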


Recall.

\[
f\big|_{X_\nu - X_{\nu+1}} : X_\nu - X_{\nu+1} \longrightarrow S_\nu - S_{\nu+1}
\]

is a fibering map since it has constant rank over S_ν − S_{ν+1}.


Recall. A point is critical for a map if the map df is not surjective. It is
clear that such a stratification (this “locally constant rank” stratification)
should enter our picture if we want to go above the Landau surface L; for our
family (X × T, 𝒮) and its projection map π, the formulation runs as follows:
X × T → T being trivial, only the stratification of the map π|_𝒮 : 𝒮 → T is
interesting. We have already introduced the subfamilies 𝒮_I of 𝒮. The
stratification of the 𝒮_I according to the rank of π|_{𝒮_I} shall be dealt with and
illustrated by Pham in the next chapter.

I want to emphasize that, from the point of view of the isotopy theorem,
what we want to establish is a common trivialization for all the manifolds
which occur in the fiber. Therefore, when the hypersurfaces no longer have
normal crossings, we shall have to separate out the singular locus of their
intersection. Hence the stratification of the fiber shall be augmented by the
addition of new “critical strata.”

The relevant notion here is the regular stratification of an analytic set. 
I shall follow Whitney. 


Stratification of an Analytic Space V 


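Definition. A stratification of V is a partition S(V) of V into subsets M which are
manifolds, such that for all M ∈ S(V) the closure M̄ is an analytic subspace with

\[
\bar{M} - M = \bigcup_{\substack{M' \in S(V),\ M' \neq M \\ M' \cap \bar{M} \neq \emptyset}} M'
\]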


Definition. A stratification is regular if the following holds: 


(A) Let p ∈ M_i < M_j. Each plane T ∈ τ(M_j, p) = {set of limits of
tangent planes to M_j at points q ∈ M_j as q → p} contains T(M_i)_p.

Now, to define a second set of limit tangent planes, let γ_{ij} be the
holomorphic retraction defined by the plane transversal to M_i through q ∈ M_j;
it cuts M_i at a point denoted γ_{ij}(q). Let us denote by n(q − γ_{ij}(q)) the
direction defined by the vector q − γ_{ij}(q).

Consider τ(M_j, M_i, p, β) equal to the closure of the set of tangent
planes to M_j at points q as q → p in such a way that

\[
n(q - \gamma_{ij}(q)) \to \beta
\]

(B) Each plane T ∈ τ(M_j, M_i, p, β) contains β, for all β.

Note that by taking all directions of approach to an analytic manifold
M from the ambient analytic space V, we replace the normal bundle of M by a
fiber space over M having as fiber a projective space instead of a vector space.
Call it M̃. It is then possible to replace V by an analytic space Ṽ such that

\[
\begin{array}{ccc}
\tilde{M} & \subset & \tilde{V} \\
\downarrow & & \downarrow {\scriptstyle \pi} \\
M & \subset & V
\end{array}
\]

and π is an analytic isomorphism of Ṽ − M̃ with V − M.

This process of replacing M by the fiber space M̃ and fixing (V − M) back to M̃
analytically to obtain an analytic space Ṽ is known as blowing-up of
V with center the submanifold M (“permissible monoidal transformation”
is the name used by Hironaka).


Exercise. Let M_i be in the boundary of M_j: M_i < M_j. Blow up M_i into M̃_i
using the normal bundle of M_i. Transform M_j into Ṽ by means of the
isomorphism π^{-1}. Take the closure of π^{-1}(M_j) in M̃_i: to what properties
do conditions (A) and (B) correspond?


A regular stratification exists for an analytic set (Whitney’s theorem). To
dissect (X × T, 𝒮)_L into the union of fiber pairs, we should construct a
refinement of the “constant rank” stratification such that: (1) The new
stratification is regular; hence both (X × L, 𝒮_L) and L are stratified and their
strata satisfy (A) and (B). Above T − L, we already have a fibering map.
(2) π is a proper stratified map, that is, it sends each stratum of X × T onto a
stratum of T. (3) If the pair of strata U > W is sent into the pair U' > W',
then we shall require a “vertical” property (A) (see Fig. 16): If p ∈ W and
q ∈ U,

\[
\lim_{q \to p} V \cap T(U)_q \supset V \cap T(W)_p
\]

where V denotes again the tangent vector space vertical with respect to the
map π.


FIGURE 16


Main Isotopy Theorem


(See Thom’s lecture, Theorem 1.) Let π : (X × T, 𝒮)_L → L be a
surjective stratified morphism. Let Y be a stratum of L and U a small open
neighborhood in Y; then π^{-1}U, equipped with the induced stratification of
(X × T, 𝒮)_L, is isomorphic to a product of U by a stratified fiber (X, S).

Thom proves this theorem by replacing (A, B) by his own construction
of carpeting functions and tubular retractions, which are better adapted to the
construction of vector fields and of isotopies. On the other hand, the proof
of the existence of the stratification in the complex analytic case is more easily
expressed in terms of (A, B). Using this theorem, we can now think of the
family (X × T, 𝒮) as a collection of fiber pairs above the strata into which the
parameter variety T is decomposed by the stratification of the map π.

The main difficulty for the applications is that we have no efficient way
of constructing the stratification for the family (X × T, 𝒮). It would be very
useful to have algebraic criteria to dissect the Landau set into strata having the
differential properties expressed by (A) and (B). One expects then that each
stratum could be constructively defined by a “symbol,” a refined version of
the contracted diagram we have used to denote the different components of
the Landau set.

If this stratification were done, we could then proceed to look for the 
relations between the fiber pairs above the different strata of the parameter 
variety. 




The situation is very similar to the Lefschetz hyperplane section theorem
for an algebraic variety of dimension n when one searches for the homology
relations between the vanishing cycles in a generic section. On one hand,
only the invariant cycles of dimension n − 1 are relevant for the variety since
the vanishing cycles are homologous to 0 on the variety; on the other hand, the
relations between the vanishing cycles are connected with the existence of
cycles of dimension n for the variety.

Hence I think it will not be too misleading to give as an application of
Thom’s isotopy theorem a spectral sequence due to Fary. I am grateful
to Professor Steenrod for the following presentation.


Construction of the Spectral Sequence of Fary 


Assume that a mapping f : X → Y has the following properties: there is a
filtration Y_0 ⊂ Y_1 ⊂ ··· ⊂ Y_m = Y of Y by closed subsets, and for each k, if
X_k = f^{-1}Y_k, then f|_{X_k − X_{k−1}} : X_k − X_{k−1} → Y_k − Y_{k−1} is a fiber bundle
(locally a product) with fiber F_k. To simplify the presentation, assume also
that Y is a complex and that each Y_k is a subcomplex.

Let Y^p denote the p-dimensional skeleton of the complex on Y, and
set X^p = f^{-1}Y^p. We have then two filtrations of X; we shall combine them
into a single filtration and consider the associated spectral sequence. Let
X_i^p = X^p ∩ X_i, and define the pth term of the new filtration to be

\[
{}_pX = \bigcup_{i=0}^{[p/2]} X_i^{\,p-2i}
\]


A picture (Fig. 17) is useful in keeping track of the various sets. For example,
the heavy stepped line denotes the upper boundary of ${}_pX$. The associated
spectral sequence of this filtration of $X$ is now well defined, and it will
converge to the cohomology of $X$. The main point is to show that the $E_2$-term
has the structure

$$E_2^{p,q} = \sum_{i=0}^{[p/2]} H^{p-2i}(Y_i, Y_{i-1}; H^{q+2i}(F_i))$$

and that the $d_2$-operator is a sum of two operators, one of which is that of the
Leray-Serre spectral sequence of the fibration $(X_i - X_{i-1}) \to (Y_i - Y_{i-1})$. By
definition,

$$E_1^{p,q} = H^{p+q}({}_pX, {}_{p-1}X)$$

Since ${}_pX - {}_{p-1}X$ is the union

$$\bigcup_{i=0}^{[p/2]} (X^{p-2i} - X^{p-2i-1}) \cap (X_i - X_{i-1})$$


and $f \mid X_i - X_{i-1}$ is a fibration, we can identify the cohomology of the $i$th
term with the cochain group $C^{p-2i}(Y_i, Y_{i-1}; H^{q+2i}(F_i))$.
Then,

$$E_1^{p,q} = \sum_{i=0}^{[p/2]} C^{p-2i}(Y_i, Y_{i-1}; H^{q+2i}(F_i))$$

One now computes $d_1$ as a coboundary operator on these cochains.
One expects $d_1$ to have two components, one for each of the filtrations,
but in our case the way the two filtrations fit together makes the vertical
component zero. However, the vertical component of $d_2$ will not be
zero.

FIGURE 17

In fact the differential $d_2$ is the sum of two differentials $d'$, $d''$, where

$$d'': H^{p-2i}(Y_i, Y_{i-1}; H^{q+2i}(F_i)) \to H^{p-2i+2}(Y_i, Y_{i-1}; H^{q+2i-1}(F_i))$$




is the differential of the fibration $F_i \to X_i - X_{i-1} \to Y_i - Y_{i-1}$,
and

$$d': H^{p-2i}(Y_i, Y_{i-1}; H^{q+2i}(F_i)) \to H^{p-2i+1}(Y_{i+1}, Y_i; H^{q+2i}(F_{i+1}))$$

is the differential from the exact sequence of the triple $(Y_{i+1}, Y_i, Y_{i-1})$. The
limit of the spectral sequence is a graded group associated with the filtration of 
H*(X). Hence, we have gained some information on the relations among the 
different fiber pairs when they are matched together to form the total space X. 
Needless to say, I know of no specific calculation. 


PART 3: APPLICATIONS 
Let us consider the following application of the isotopy theorem. 


Lemma. $X$ is a topological space; $S$ is a subspace, $i: S \hookrightarrow X$. Assume:
$\forall K$ compact in $X$, there exists a map $\sigma_K$ which is an isotopy of $X$ such that

$$\sigma_K S \subset X - K$$

then the map

$$i^*: H_c^*(X) \to H_c^*(S)$$

is the zero map. (Notation: we shall use from now on the mathematical
symbols: $\exists$ = there exists and $\forall$ = for all, ...)


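PROOF. Take $h$ to be a representative of a compact cohomology class in
$H_c^*(X)$. Then $\exists K$ such that $\operatorname{Supp} h \subset K$. Let $i_K$ be the inclusion map
$\sigma_K S \hookrightarrow X - K$. One has $i_K^* h = 0$ from the commutative diagram.

$$\begin{array}{ccc}
H_c^*(X) & \xrightarrow{\ i^*\ } & H_c^*(S) \\
\mathrm{Id}\,\Big| & & \Big|\,\sigma_K \\
H_c^*(X) & \xrightarrow{\ i_K^*\ } & H_c^*(\sigma_K S)
\end{array}$$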


Note that a much weaker assumption than the existence of an isotopy is 
sufficient for the conclusion. 


Corollaries 
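1. $S_1$, $S_2$ are closed subspaces of $X$.
Assume: $\forall K$, $\exists \sigma_K^i$ such that

$$\sigma_K^i: S_i \to X - K, \qquad \sigma_K^i: S_j \to S_j \qquad \text{for } i, j = 1, 2$$

Then

$$H_c^p(X - S_1 - S_2) = H_c^p(X - S_2) \oplus \delta H_c^{p-1}(S_1 - S_2)$$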


Perturbation Theory 385 


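PROOF. $S_1 - S_2$ is closed in $X - S_2$.
Hence we have the exact sequence:

$$\cdots \to H_c^{p-1}(S_1 - S_2) \xrightarrow{\ \delta\ } H_c^p(X - S_1 - S_2) \to H_c^p(X - S_2) \xrightarrow{\ i^*\ } H_c^p(S_1 - S_2) \to \cdots$$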
Now, generalize! 


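2. Let $X$ be a complex manifold, $\dim_{\mathbb{R}} X = 2n$. $S_1$, $S_2$ are two complex
submanifolds of codimension 1; hence $\dim_{\mathbb{R}} S_i = 2n - 2$.

Assume: (a) There exist isotopy maps $\sigma_K^i$, and (b) $S_1$ and $S_2$ are
transversal, $S_1 \pitchfork S_2$.

Write again the previous exact sequence for a closed set. One obtains

$$\begin{array}{ccccccc}
\to & H_c^{q-1}(S_1 - S_2) & \xrightarrow{\ \delta\ } & H_c^{q}(X - S_1 - S_2) & \to & H_c^{q}(X - S_2) & \to \\
 & \big\updownarrow & & \big\updownarrow & & \big\updownarrow & \\
\to & H^c_{p-1}(S_1 - S_2) & \xrightarrow{\ [\delta]\ } & H^c_{p}(X - S_1 - S_2) & \to & H^c_{p}(X - S_2) & \to
\end{array}
\qquad p + q = 2n$$

where the vertical isomorphisms are given by the usual Poincaré duality for
manifolds. Hence the second line is exact. So,

$$H^c_p(X - S_1 - S_2) = H^c_p(X - S_2) \oplus [\delta] H^c_{p-1}(S_1 - S_2)$$

where $[\delta]$ is the map of the homology sequence.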


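3. Let $\{S_i\}_{i \in I}$ be algebraic manifolds in general position in $P^n$. Let

$$X = C^n \cap \bigcap_{i \in J} S_i$$

where $C^n = P^n - P^{n-1}$ is the affine space, and $J$ is a subset of $I$.

Then

$$H^c_p\Big(X - \bigcup_{i \in I - J} S_i\Big) = \bigoplus_{H \subset I - J} [\delta]^{|H|}\, H^c_{p - |H|}\Big(X \cap \bigcap_{i \in H} S_i\Big)$$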


Picard—Lefschetz Formula (a beginning!) 


Consider the case of one hypersurface globally given in P” by one equa- 
tion S(x, t)=0. We restrict severely the type of singularities. 


Assume: (A) For $t = 0$, $S$ has an isolated nondegenerate quadratic
singularity, that is for $t = 0$, $S_x = 0$, but $\det(\operatorname{Hess} S) \ne 0$. (B) Localization
in $W$, where we can assume $W$ to be the open ball $\|x\| < 1$.

In $X - W$, we have an external isotopy which sends $\partial W$ into $\partial W$ since
everything is transversal outside $W$.

For simplicity, we shall suppose that $\dim T = 1$; hence we have a
family $(X \times T, S)$ over the disk $|t| < 1$, the singular fiber of the family being




the fiber above $t = 0$ only, and the singularity satisfying the restriction (A);
for $t \ne 0$ in the disk we can apply the little isotopy theorem in $W$.

It is convenient (but not necessary: see the final formulas!) to introduce
several types of "homologies." I hope it can make the picture clearer. If $h$
is a cycle in $H^c_*(X - S)$, it is natural to take its trace in the open ball $W$: we
localize the cycle in a neighborhood of the singularity,

$$h \longmapsto \operatorname{tr}_W h$$

then $\operatorname{tr}_W h$ is a cycle with closed support in $W$, which avoids the manifold $S$.

Hence we can cook up the following homology group (usually defined
by means of a family of supports): $H^F_*(\underline{W} - S)$, where the superscript $F$ and
the underlined $W$ mean that we are interested in cycles with closed supports in
$W - S$ but the support has also to be a closed set in $W$!

Assumption (A) means that in $W$, there exists a local equation for $S$ of
the form $\sum x_i^2 = t$. By choosing such a simple form, we have further assumed
that no extra complication was coming from the parametrization of the family
of hypersurfaces over $t$, that is, $\partial S/\partial t \ne 0$.

Now let us look at $W$ "under a microscope." Then we are simply
considering a complex quadric $S$ in $C^n$, and we can discover only one
generator $\tilde{e}$ for $H^F_n(C^n - S)$. (Hence, we choose an orientation.) Similarly,
looking at the homology group $H_{n-1}(S \cap W)$, we find only one generator
which we shall denote by $e$ and which is explicitly described as follows:

$$e = \Big\{ \sum x_i^2 = t,\ \ x_i/\sqrt{t} \text{ real} \Big\} + \text{choice of an orientation}$$

(a complex quadric is the tangent bundle of a real sphere, and can be retracted
on this real part). But $e$ is homologous to 0 in $W$. Hence there exists a
chain (oriented) denoted $e(W, S)$ such that its boundary in $S$ is the cycle $e$:

$$e(W, S) \xrightarrow{\ \partial\ } e$$

The basic "duality" we shall build in a very complicated manner is described
by the formula

$$KI(\tilde{e}, e(W, S)) = \pm 1$$

where $KI$ is the old Kronecker index counting intersection with sign. (It is
also the linking number of $\tilde{e}$ with $e$!)

We have finished setting up a trap for our Feynman cycle $h_F$. If
$\operatorname{tr}_W h_F \sim n\tilde{e}$, then $h_F$ will change its homology class when we make a loop $\sigma$
around $t = 0$ in the disk. If $\operatorname{tr}_W h_F$ is not homologous to a multiple of $\tilde{e}$, then
it is homologous to 0; hence it can be pushed away from the ball $W$ which



localizes the singularity, and therefore is left invariant by the automorphism
induced by the loop $\sigma$ on the homology group to which $h_F$ belongs.

So let us go into some details of the machinery of this duality: Let
$X$ = $C^\infty$-paracompact manifold oriented, and $S$ = union of finite number of
oriented, closed submanifolds which intersect transversally.

Let $\dim X = n = p + q$. The duality we are after can be expressed as a
pairing

$$H^F_q(X - S) \otimes H^c_p(X, S) \to C$$

Given a homology class $\alpha$ of dimension $q$, the usual Poincaré duality gives a
cohomology class $\mathcal{D}\alpha$ of the same dimension $q$. For cohomology with
support, this duality as formulated by Cartan, lands us in a cohomology
group denoted by $H_F^p(X - S)$ (that is, cohomology with the same family of
supports: closed sets in $X - S$ which are also closed in $X$). But this class $\mathcal{D}\alpha$
is a usual relative cohomology class. At the risk of sounding pedantic call $G$
the isomorphism,

$$H_F^p(X - S) \xrightarrow{\ G\ } H^p(X, S)$$

($G$ is for Godement, Théorie des faisceaux, or Grothendieck, Local
Cohomology.)

Now, even a physicist can integrate a relative cohomology class $G \circ \mathcal{D}\alpha$
on a compact relative cycle $\beta$ of dimension $p$. Hence we have a number,
denoted by

$$\langle \alpha | \beta \rangle = \int_\beta G \circ \mathcal{D}\alpha$$


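Let us summarize these results in a diagram where $S$ consists of a single
submanifold of $X$:

$$\begin{array}{c}
\alpha \in H^F_q(X - S) \\
\big\downarrow\ \mathcal{D} \\
\mathcal{D}\alpha \in H_F^p(X - S) \\
\big\downarrow\ G \\
G \circ \mathcal{D}\alpha \in H^p(X, S) \\
\big\downarrow\ \text{integration over } \beta \in H^c_p(X, S) \\
\langle \alpha | \beta \rangle
\end{array}$$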




We shall proceed now to construct the same duality for the case of two
submanifolds $S = S_1 \cup S_2$ by using the exact sequence associated with a
triple $(X, S_1, S_2)$. This is the construction Leray develops in Bull. Soc. Math.
de France (III) 87, pp. 81-180, 1959.

Introducing exact triangles to represent the associated exact sequences,
we have successively

$$\begin{array}{ccc}
 & H^F_*(X - S_1 \cup S_2) & \\
 & \swarrow \qquad\qquad \nwarrow\,[\delta] & \\
H^F_*(X - S_2) & \xrightarrow{\ \cap [S_1]\ } & H^F_*(S_1 - S_2)
\end{array}$$

where the two interesting maps are:

1. The intersection by the submanifold $S_1$ which is a homology class,
and gives a map lowering dimension by 2.

2. The cobord map $[\delta]$ (Leray) for which we now have two definitions,
the first one by applying the Poincaré duality to manifolds as we did
in Corollary 2, the second the geometric definition we met in studying
tubular neighborhoods.

The two following steps are classical,

$$\begin{array}{ccc}
 & H_F^*(X - S_1 \cup S_2) & \\
 & \swarrow \qquad\qquad \nwarrow & \\
H_F^*(X - S_2) & \longrightarrow & H_F^*(S_1 - S_2)
\end{array}
\qquad
\begin{array}{ccc}
 & H^*(X, S_1 \cup S_2) & \\
 & \swarrow \qquad\qquad \nwarrow & \\
H^*(X, S_2) & \longrightarrow & H^*(S_1, S_2)
\end{array}$$

They are the usual exact sequence for a closed set.
The last stage is reached by

$$\begin{array}{ccc}
 & H^c_*(X, S_1 \cup S_2) & \\
 & \nearrow \qquad\qquad \searrow & \\
H^c_*(X, S_2) & \longleftarrow & H^c_*(S_1, S_2) \cong H^c_*(S_1 \cup S_2, S_2) \ \ \text{(excision)}
\end{array}$$

Note that this last triangle runs in the opposite direction. Duality reverses
arrows.

We shall summarize the relevant parts in Fig. 18. The vertical lines are
isomorphisms. We have the formula

$$\langle \delta\alpha | \beta \rangle = \langle \alpha | \partial\beta \rangle$$




FIGURE 18


Its "true" origin is best seen by reverting to the simple case we treated at the
beginning of this lecture: Let $S$ again be the complex quadric in the ball $W$
(see Fig. 19). We had

$$KI(\tilde{e}, e(W, S)) = \pm 1$$

If we define $\tilde{e} = [\delta]\varepsilon$,

$$\varepsilon \in H^F_{n-1}(S \cap W)$$

then

$$KI(\varepsilon, e) = \pm 1$$


FIGURE 19 




Hence, if the Feynman cycle $h_F$ is trapped in the ball $W$, $\operatorname{tr}_W h_F$ is the lifting by
the geometrical cobord map $[\delta]$ of a cycle $h'_F$ lying on $S$: $\operatorname{tr}_W h_F = \delta h'_F$ and

$$KI(\operatorname{tr}_W h_F, e(W, S)) = KI(h'_F, e)$$


PART 4: LERAY—THEORY OF RESIDUES 


Let $X$ be a nonsingular complex analytic manifold, $\dim X = l$. Introduce
local coordinates $x = (x_1, x_2, \ldots, x_l)$ for the complex structure, but for the
real $C^\infty$-differentiable structure, we shall use $x = (x_1, \bar{x}_1, \ldots, x_l, \bar{x}_l)$. We
shall note, for all complex manifolds of codimension 1, their local irreducible
equation as follows: $S$, in the neighborhood of $y \in S$, is given by $s(x, y) = 0$.

Let $(S_i)_{i = 1, \ldots, m}$ be such that $ds_i \ne 0$ for $x = y$ and we shall say that
$(S_1, \ldots, S_m)$ are in general position at $y \in S = \bigcap S_i$ if the $(ds_i)_{i = 1, \ldots, m}$ are linearly
independent.

As we are interested in relative cohomology, the basic set up is as
follows:

$$S = S_1 \cap S_2 \cap \cdots \cap S_m \qquad \text{(intersection)}$$

This shall carry the residue class, and the cohomology relative to the union of
submanifolds will be

$$S' = S'_1 \cup S'_2 \cup \cdots \cup S'_{m'}$$

Recall. By a regular function, one means a function which is $C^\infty$-differentiable
over $X$.

A differential form $\varphi(x)$ is regular if it has regular coefficients. If
$d\varphi = 0$, $\varphi$ is closed.

Note that it is a differential form in $(dx, d\bar{x})$ and that we shall mainly
consider the homogeneous components of $\varphi$. Then we shall speak of the
degree of $\varphi$, and of its type $[d^\circ\varphi, (p, q)]$, where $p + q = d^\circ\varphi$. We shall make
use of the graded (respectively bigraded) structure of the ring of differential
forms.

Recall that $d = \partial + \bar{\partial}$.

Definition. $\varphi$ has a polar singularity of order $p$ along $S$ if $\varphi(x)$ is regular on
$X - S$ and if there exists a number $p$ such that $s(x, y)^p \varphi(x)$ is regular on $X$.


Now, let us define the residue form. Let $d\varphi = 0$ on $X - S$, let $\varphi$ have a polar
singularity of order 1 along $S$. Then

Proposition.

$$\varphi = \frac{ds}{s} \wedge \psi + \theta \qquad (R)$$

where the restriction of the differential form $\psi$ on $S$, $\psi|_S$, is closed.




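Definition. $\operatorname{Res}[\varphi] = \psi|_S$

NOTATION. $\operatorname{Res}[\varphi] = \dfrac{s\varphi}{ds}\Big|_S$; if $\varphi = \dfrac{\omega}{s}$, then $\operatorname{Res}[\varphi] = \dfrac{\omega}{ds}\Big|_S$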


Let us postpone the proof. Basic for the formula (R), as its form suggests, is 
a division theorem for differential forms. 


Local Division Theorem 


Lemma. Let $\varphi(x)$ be a regular form on $X$, then

$$ds(x, y) \wedge \varphi(x) = 0 \iff \exists \text{ a form } \psi(x, y) \text{ regular near } y \text{ such that } \varphi(x) = ds(x, y) \wedge \psi(x, y)$$

Further, $\psi|_S$ is uniquely defined and if $\varphi$ is holomorphic, one can choose
$\psi(x, y)$ holomorphic.

PROOF (Leray): Take $s = x_1$.

To get an idea of the difficult refinements this simple division admits,
let us allow some singularity to $S$, say a nondegenerate quadratic point:

$$(s_x = 0,\ \det \operatorname{Hess} s \ne 0) \quad \text{for } x = y$$

$y$ is a "simple" singularity of $S$. Then (De Rham) "Let $\varphi$ be holomorphic
on $X$. Then there exists a form $\psi$ holomorphic such that $\varphi = ds \wedge \psi$ if and
only if $ds \wedge \varphi = 0$ when $d^\circ\varphi < l$ and, when $d^\circ\varphi = l$, that is,

$$\varphi = a(x)\, dx_1 \wedge \cdots \wedge dx_l,$$

then $a(y) = 0$."

To make this division, one has to introduce the ring $A$ of holomorphic
functions in a ball around $y$, small enough so that

$$\Big( z_1(x) = \frac{\partial s}{\partial x_1},\ \ldots,\ z_l(x) = \frac{\partial s}{\partial x_l} \Big)$$

are local coordinates. One recognizes the 1-jet of the function $s(x)$. The
main property of $(A)$ is now "if $z_{k+1}(x) a(x) \in (I_k)$ = ideal generated by
$(z_1, z_2, \ldots, z_k)$, then $a(x) \in (I_k)$."

We return now to the proof of the Proposition: 

1. Existence: Note that $d(s\varphi) = ds \wedge \varphi$, since $\varphi$ is closed. Applying
the previous lemma, we have

$$ds \wedge \varphi = ds \wedge \theta \qquad \text{(first division)}$$




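which we shall write

$$ds \wedge (s\varphi - s\theta) = 0$$

Note that $s\varphi - s\theta$ is regular; hence

$$s\varphi - s\theta = ds \wedge \psi \qquad \text{(second division)}$$

2. Uniqueness of $\psi|_S$: Consider the equation

$$\frac{ds}{s} \wedge \psi + \theta = 0$$

Hence $ds \wedge \theta = 0$. Thus (first division) we can divide: there exists $\omega$ such
that $\theta = ds \wedge \omega$. Now $ds \wedge (\psi + s\omega) = 0$. Hence there exists $\varpi$ such that
$\psi + s\omega = ds \wedge \varpi$ (second division) and $\varpi$ is regular. Hence $\psi|_S = 0$.

3. Let us change the local equation (associated vector bundle!)

$$s = u s'$$

with $u$ holomorphic; then

$$\varphi = \frac{ds'}{s'} \wedge \psi + \theta'$$

where $\theta' = \theta + d(\log u) \wedge \psi$.

4. $\psi|_S$ is closed on $S$. We have

$$-\frac{ds}{s} \wedge d\psi + d\theta = 0$$

Hence, by 2, $d\psi|_S = 0$.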

For the case of quadratic nondegenerate singular points of S, one has 
that the residue form is still “‘holomorphic”’ at the singular point. 

Let me quote without entering into details, a majorization of the residue 
form, very useful in applications: 


$$\left| \frac{\omega}{ds} \right|_S \le \frac{|\omega|}{|ds|}$$

at each point of $S$.
To put the residue form to work, let us go back to compact (relative) 
homology. The exact sequence of a triple $(X, S, S')$ gives

$$\begin{array}{ccc}
 & H^c_*(X, S') & \\
 & \swarrow \qquad\qquad \nwarrow & \\
H^c_*(X, S \cup S') & \xrightarrow{\ \partial\ } & H^c_*(S, S')
\end{array}$$

with the three well-known homomorphisms. Again, let us introduce for
$S = S_1$ ($m = 1$) the tubular neighborhood of $S$: $V$ is a fiber space over $S$ with
such small fiber that we have the picture shown in Fig. 20.

FIGURE 20

$$\mu: V \to S$$

We have two obvious retractions

$$\nu: X - S \to X - V$$

and the corresponding maps at the homology level

$$\nu_*: H^c_*(X - S, S') \to H^c_*(X - V, S')$$

are an isomorphism, since $\nu$ is homotopic to the identity on $X - S$ and

$$\mu_*: H^c_*(S, S') \to H^c_*(X, (X - V) \cup S')$$


which is geometrically constructed as follows:
Over $\sigma$, simplex of $S$, construct the cell $\mu^{-1}(\sigma)$

$$\mu^{-1}(\sigma) = \{x \in V \mid \mu(x) \in \sigma\}$$

The corresponding oriented cell is given by $\mu_*\sigma$. One makes certain that

$$(\mu_*\partial - \partial\mu_*)\sigma \subset X - V$$

and also sees that intersecting by $S$ gives an inverse map for $\mu_*$. Hence $\mu_*$ is
a famous isomorphism.


Lemma. Let $\gamma$ be a chain of $X$ which is transversal to $S$. Let $\partial\gamma = \text{chain}_{X-S}
+ \text{chain}_{S'}$. Let $h(X - S, S')$ be the corresponding homology class $[\partial\gamma]$, and
let $h(S, S') = [\gamma \cdot S]$. Then $\delta h(S, S') = h(X - S, S')$.

Denote by $V_\varepsilon$, $\mu_\varepsilon$, $\delta_\varepsilon$ a fundamental system of such objects as considered
above.




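Let $\varphi(x)$ regular on $X - S$ be such that near each point $y \in S$, there exist
regular forms $\psi(x, y)$, $\theta(x, y)$ such that

$$\varphi(x) = \frac{ds(x, y)}{s(x, y)} \wedge \psi(x, y) + \theta(x, y)$$

Then we have the following!

Lemma. $\psi|_S$ is independent of $y$ and

$$\lim_{\varepsilon \to 0} \int_{\delta_\varepsilon \gamma} \varphi(x) = 2\pi i \int_\gamma \psi|_S$$

for any compact chain $\gamma$.

PROOF. Let $\sigma$ be a simplex of $S$, take $s(x, y) = x_1$, then

$$\lim_{\varepsilon \to 0} \int_{\delta_\varepsilon \sigma} \varphi(x) = \lim_{\varepsilon \to 0} \int_{\delta_\varepsilon \sigma} \frac{dx_1}{x_1} \wedge \psi(x, y) = \oint \frac{dx_1}{x_1} \cdot \int_\sigma \psi|_S$$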


Hence we have the important residue formula:

$$\int_{\delta h(S, S')} \varphi(x) = 2\pi i \int_{h(S, S')} \operatorname{Res} \varphi$$
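The residue formula can be checked numerically in the simplest possible setting. The sketch below is only an illustration under stated assumptions: it takes $X = C$, $S = \{0\}$, $S'$ empty and the local equation $s = x$, so that $\delta_\varepsilon$ of the point $S$ is the circle $|x| = \varepsilon$. The sample forms `psi` and `theta` and the function name `delta_eps_integral` are hypothetical choices, not taken from the lecture.

```python
import cmath
import math


def delta_eps_integral(psi, theta, eps, n=4000):
    """Integrate (dx/x) * psi(x) + theta(x) dx over the circle |x| = eps.

    The circle is parametrized counterclockwise and the integral is
    approximated by the periodic trapezoidal rule with n nodes.
    """
    total = 0j
    dtheta = 2.0 * math.pi / n
    for k in range(n):
        x = eps * cmath.exp(1j * k * dtheta)
        dx = 1j * x * dtheta          # dx = i x d(angle) on the circle
        total += (psi(x) / x + theta(x)) * dx
    return total


# Assumed sample data: psi is holomorphic, theta is regular but not holomorphic.
psi = lambda x: cmath.exp(x)          # psi restricted to S is psi(0) = 1
theta = lambda x: x.conjugate()       # contributes 2*pi*i*eps**2, which vanishes as eps -> 0

for eps in (0.5, 0.1, 1e-3):
    value = delta_eps_integral(psi, theta, eps)
    # The printed difference shrinks like eps**2, in line with the limit in the lemma.
    print(eps, abs(value - 2j * math.pi * psi(0)))
```

The regular remainder `theta` only disappears in the limit $\varepsilon \to 0$, which is why the lemma is stated with a limit rather than for a fixed tube.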
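And we have rebuilt in a most natural way the duality of

$$\begin{array}{ccc}
 & H^c_*(X, S') & \\
 & \swarrow \qquad\qquad \nwarrow & \\
H^c_*(S, S') & \xrightarrow{\ \delta\ } & H^c_*(X - S, S')
\end{array}
\qquad
\begin{array}{ccc}
 & H^*(X, S') & \\
 & \nearrow \qquad\qquad \searrow & \\
H^*(S, S') & \xleftarrow{\ \operatorname{Res}\ } & H^*(X - S, S')
\end{array}$$

where the map $H^*(X, S') \to H^*(X - S, S')$ is the restriction map.
The triangles are exact. Most interesting is to look at the exactitude of
the right triangle, left corner.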


First canonical problem: construct a form with given residue. Let 
V again be a tubular neighborhood of S. 


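Lemma. Let $\varphi \in \Phi(S, S')$, the closed differential forms on $S$ which
vanish on $S'$. There exists a differential form $\omega \in \Omega(X - S, S')$ and two
differential forms $\psi(x, y)$, $\theta(x, y)$ such that

$$\omega|_{X - V} = 0 \quad (\text{or } \operatorname{Supp} \omega \subset V)$$

$$\omega = \frac{ds}{s} \wedge \psi + \theta$$

with

$$\psi|_S = \varphi, \qquad \psi|_{S'} = \theta|_{S'} = 0$$

and $d\psi = 0$ in a neighborhood of $S$.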




PROOF. We have a map $\mu$ which is a retraction of $V$ on $S$ such that $V \cap S'$ is
sent into $S \cap S'$. $\varphi(\mu(x))$ is closed on $(V, S')$. Let $\chi(x)$ be a $C^\infty$-function, 1
near $S$, 0 in $X - V$. Let $\psi(x) = \chi(x)\varphi(\mu(x))$ on $V$ and extended by 0 on
$X - V$. Then we have achieved $\psi|_S = \varphi$, $\psi|_{S'} = 0$, and $d\psi = 0$ near $S$.

Using a partition of unity $\chi(x, y)$ such that $\operatorname{Supp} \chi(x, y)$ is contained in the
set of points $x$ where the local equation $s(x, y)$ is defined, we obtain

$$\omega(x) = \sum_z \chi(x, z)\, \frac{ds(x, z)}{s(x, z)} \wedge \psi(x)$$

and the remainder

$$\theta(x, y) = \sum_z \chi(x, z)\, d\Big( \log \frac{s(x, z)}{s(x, y)} \Big) \wedge \psi$$

Our first canonical problem is then solved by taking the cohomology classes

$$h^*(S, S') = \text{class of } \varphi$$

$$h^*(X, S') = \text{class of } d\omega$$

We use this solution in the following lemma. 
Lemma. If $h^*(X, S') = 0$, then $\varphi(x)$ is the residue of a form in $\Phi(X - S, S')$
with a polar singularity on $S$ of order 1. There exists a form $\sigma \in \Omega(X, S')$
such that

$$d\omega = d\sigma$$

Hence

$$\omega - \sigma = \frac{ds}{s} \wedge \psi + \theta - \sigma$$
is a closed differential form. Thus we have constructed the map @* of the 


figure. 
We shall now look at the second canonical problem. 


Lemma. If Res h*(X — S, S’) = 0, then h*(X — S, S’) contains forms with 
nonzero residue (hence polar singularity of order 1). 


Lemma. If $h^*(S, S') \in \operatorname{Im} \operatorname{Res}$, then any form $\varphi \in h^*(S, S')$ is residue form of
some form in $\Phi(X - S, S')$.

Lemma. Let $\varphi \in \Phi(S, S')$

$$\varphi = \text{Res form of (something)} \in h^*(X - S, S')$$
iff 


1 
eeaaSy * | h* =. , 
~ Es es h*(X — S, S’) 




Hence, Theorem 1 of Leray, which is central if you think of the long path we
followed:

Let $\varphi$ be a closed form on $X - S$, 0 on $S'$. Then $\varphi$ is homologous to
$\tilde\varphi$ in the ring of differential forms $\Omega(X - S, S')$: $\varphi \sim \tilde\varphi$, where $\tilde\varphi$ has
polar singularities of order 1. The set of the forms $s\tilde\varphi/ds\,|_S$ is a cohomology
class of $(S, S')$.

NOTE. This would be false if one had replaced the ring $\Omega$ of regular forms by
the ring of holomorphic forms.

For the case $m = 2$, if $S_1$, $S_2$ have global equations $S_1(x) = 0$, $S_2(x) = 0$,
one has the division theorem suitably generalized!

Lemma. Let $\varphi$ be a closed form on $X - S_1 \cup S_2$ with polar singularity of
order 1 on $S_i$, respectively, $i = 1, 2$. Assume $s_1(x, y) s_2(x, y) \varphi$ regular near
$y \in S_1 \cap S_2$. Then

$$\varphi = \frac{ds_1}{s_1} \wedge \frac{ds_2}{s_2} \wedge \psi + \frac{ds_1}{s_1} \wedge \theta_1 + \frac{ds_2}{s_2} \wedge \theta_2 + \omega$$

where one deals with polar singularity of order 1 only. This leads to the
iterated residue $\operatorname{Res}^m$ in the same way as the geometric construction of the
iterated cobord $\delta^m$. We shall not follow this very useful generalization;
instead we use the following for the Residue-form of a product (if $S$ has a
global equation $S(x) = 0$).


Global Division Theorem 


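Lemma. If $\varphi$ is closed, with polar singularity of order 1, then $\varphi = (dS/S) \wedge \psi + \theta$,
where the differential forms are defined globally.

Then we can define the residue of a product.

Lemma. Let $\varphi_1$, $\varphi_2$ be two closed differential forms with polar singularity of
order 1 on $S$, then $\varphi_1 \wedge \varphi_2$ has a polar singularity on $S$ and we have the
relation

$$\operatorname{Res}[\varphi_1 \wedge \varphi_2] = \operatorname{Res}\Big[ \frac{dS}{S} \wedge \varphi_1 \Big] \wedge \operatorname{Res}[\varphi_2] + \operatorname{Res}[\varphi_1] \wedge \operatorname{Res}\Big[ \frac{dS}{S} \wedge \varphi_2 \Big]$$

PROOF. The proof of the foregoing is obvious if one uses the global division
formula

$$\varphi_1 = \frac{dS}{S} \wedge \psi_1 + \theta_1$$

$$\varphi_2 = \frac{dS}{S} \wedge \psi_2 + \theta_2$$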




Multiplying the two terms, one gets

$$\varphi_1 \wedge \varphi_2 = \frac{dS}{S} \wedge (\psi_1 \wedge \theta_2 + \theta_1 \wedge \psi_2) + \theta_1 \wedge \theta_2$$

Note that the polar singularity of $\varphi_1 \wedge \varphi_2$, which is of order 2, disappears
(Theorem 1).

We shall now take up the case of a polar singularity of order $k > 1$.
Consider $\varphi = \omega_k \wedge ds/s^k$ a closed form, and then, successively


(—1)" 
dy, =—— 2 dS 
(—1)'-4*! 
dw,_ =WA dS 


Exercise. Prove that one can make these successive divisions. 

Define $R(\varphi) = \psi|_S$. Assume, from now on, the strong restriction that
$S$ is an algebraic nonsingular variety in $P^n$. Consider the line bundle $L$ associated
with $S \subset P^n$, or, rather, to the global equation of $S$: $S(x) = 0$.
$L$ has a canonical section which is precisely this equation. $L|S$ is the normal
bundle of $S$ in $P^n$; but when patching local coordinates, it is natural to
consider also the $n$-times twisted form of this normal bundle.

Using the standard covering of $P^n$ by the $(n + 1)$ affine open sets

$$U_i = \{x_i \ne 0\}$$

$L^n$ is defined by the cocycle $(f_i/f_j)^n$, where $f_i = 0$, $f_j = 0$ are the equations
defining $S$ in $U_i$ and $U_j$, respectively. We are introducing these notions
because they are appropriate for expressing that the global meromorphic
differential $k$-form $\omega$ has polar singularities of order $n$ along some subvariety
$S$, or vanishes with order $p$ over some other subvariety $S'$. This is formulated
as follows: $\omega$ is a global section of the sheaf of analytic $k$-forms with
coefficients in $L^{-n} \otimes L'^{p}$, where $L$ and $L'$ are the line bundles associated with $S$ and
$S'$. Hence, it is a statement easy to write down in terms of local coordinates.
Leray's theory of residues can be also written (without much profit) in terms of
sheaves of differentiable forms, that is, differential forms with differentiable
coefficients or even weaker properties.




But we can also go in the opposite direction and ask for stronger prop- 
erties of the coefficients. To require analytic or algebraic properties for 
these coefficients seems to lead us inescapably to sheaves and their cohomo- 
logy. I shall simply state some results along these lines and give three 


references. 


1. Complex Manifolds, Lectures by S. S. Chern (University of Chicago 
1956). 

2. Integrals of the second kind on an algebraic variety, Hodge and 
Atiyah, Annals of Math. 62, p. 56, 1955. 

3. Some Results on Moduli and Periods of Integrals on Algebraic Manifolds
I, II, III (Preprints), P. A. Griffiths (Berkeley University).

The first interesting property concerns the image of the residue map in the
algebraic case. Differential forms now have the bigradation given by the
complex structure $(dy, d\bar{y})$ and the cohomology groups are now direct sums
of the cohomology groups obtained with the differential forms of a given
type:

$$H^n(X) = H^{n,0}(X) + H^{n-1,1}(X) + \cdots + H^{0,n}(X)$$

One has then the following theorem.

Theorem. Let $A_r^{q+1}(S)$ be the closed rational $(q + 1)$-forms on $X$ with
poles of order $r + 1$ on $S$.
Then the residue map sends

$$A_r^{q+1}(S) \xrightarrow{\ R\ } H^{q,0}(S) + H^{q-1,1}(S) + \cdots + H^{q-r,r}(S)$$

in such a way that $A_r^{q+1}(S)/A_{r-1}^{q+1}(S)$ maps into $H^{q-r,r}(S)$.

With this more algebraic setup, maybe we can now have a vague idea 
of the problem of defining residues on a singular variety S (that is, we shall be 
concerned with differential forms with polar singularity on S—S being 
singular). Consider first the simple case : 

Let $X$ be an algebraic manifold and $S$ an irreducible algebraic subvariety
with ordinary singularities, that is, locally $S$ is given by $x_1 \cdots x_k = 0$
[$(x_1, \ldots, x_n)$ = local coordinates on $X$]. Since $S$ is irreducible, only the local
problem looks like our initial family $(X \times T, Y)$, where $Y$ was reducible.
Then there exists a canonical model $\tilde{S}$ called the normalization of $S$ which is
a nonsingular algebraic manifold.

Thus, we are going a step further: the use of Hironaka's resolution of
singularities reduces us to the case of algebraic hypersurfaces with normal
crossings. Here to interpret our residue on an irreducible hypersurface with
normal crossings, we are led to consider coverings. In fact, the normalized
variety $\tilde{S}$ is a nice covering of $S$ for which the $k$ local sheets of $S$ which meet



transversally in a neighborhood of the "ordinary singularity $x_1 \cdots x_k = 0$"
are separated. Let $\pi$ be the covering map

$$\tilde{S} \xrightarrow{\ \pi\ } S \overset{i}{\hookrightarrow} X$$

Recall that the embedding of $S$ into $X$ determines a line bundle over $X$, that
we shall call $L$. Recall that $X$ as an algebraic manifold carries a canonical
bundle $K_X$. (See Professor Bott's lecture on the Jacobian bundle.)

Recall now that the local sections of an analytic vector bundle define an
object called a sheaf, which, if it originates from a vector bundle, is locally
free, hence coherent.

Recall that the map $\pi$ being proper one can define for coherent sheaves
the direct image by the map $\pi$:

$$\pi_*$$

Theorem. If $F$ is a coherent sheaf on $\tilde{S}$, then $\pi_* F$ is a coherent sheaf on $S$.

All these notions are forced upon us because, except for drastic restrictions
on the map $\pi$, one cannot expect the direct image of a vector bundle,
which is a nice geometric object, to be a vector bundle itself; or, in other
words: Vector bundles can be pulled back but not torn down.

Yet, using only pull-back we have the geometrical picture that follows:
Over $X$ we have two line bundles $L$ and $K_X$; construct the product $K_X \otimes L^{-1}$
of the two vector bundles $K_X$ and $L^{-1}$. Then by pulling back, we can
construct a vector bundle over $\tilde{S}$. We shall denote it by $\pi^{-1}(K_X \otimes L^{-1})$ [instead
of the correct expression $(i \circ \pi)^{-1}$]:

$$\begin{array}{ccc}
K_X \otimes L^{-1} & \longleftarrow & \pi^{-1}(K_X \otimes L^{-1}) \\
\downarrow & & \downarrow \\
X & \xleftarrow{\ i \circ \pi\ } & \tilde{S}
\end{array}$$

Look at the "sheaf" over $\tilde{S}$ of (local) sections of $\pi^{-1}(K_X \otimes L^{-1})$
which vanish on $\tilde\Sigma$, where $\Sigma$ is the singular locus of $S$ and $\pi^{-1}\Sigma = \tilde\Sigma$. Call this
sheaf $O(K_X \otimes L^{-1})_\Sigma$. It is a subsheaf of the sheaf of all local sections. Then
we have the classical adjunction formula of the Italian geometers in a disguised
form: The sheaf $O(K_X \otimes L^{-1})_\Sigma$ is isomorphic to the sheaf of local sections of




the canonical bundle of $\tilde{S}$: $O(K_X \otimes L^{-1})_\Sigma \simeq O(K_{\tilde{S}})$. The proof of all this
high-brow construction is so easy that I most strongly recommend looking
it up in Griffiths' work (III, p. 61).

The map giving the isomorphism is the residue map $R$ followed by
the lifting $\pi^*$. So let $S \hookrightarrow X$ be an embedding with ordinary singularities
(Hironaka). Let $S_k$ be the subvariety of $S$ of points of order $k$ through
which $k$ sheets of $S$ intersect transversally. Then $S_k$ is given locally by
$\{x_1 = 0, \ldots, x_k = 0\}$ if $S$ is given locally as $S = (x_1 \cdots x_k = 0)$; so that $S - S_2$
is the nonsingular part of $S$, $S_2 - S_3$ is the nonsingular part of $S_2$, etc.

We are again led to consider a nice stratification of $S$. Let us call the
strata $Y_\nu$.

Let $\tilde{Y}_\nu = \pi^{-1} Y_\nu$ be the inverse image of $Y_\nu$ in $\tilde{S}$. Then $\tilde{Y}_\nu$ is an
unbranched covering with $\nu$ sheets above $Y_\nu$.


Griffiths' Proposition


Let $N_\nu$ be the normal bundle of $Y_\nu$ in $X$. Then taking germs of its local
sections and denoting this sheaf $O(N_\nu)$, one has


O(N,) = O(L) @g,, Oy, 


Now let us again define $A^q(k)$ as the closed $q$-forms on $X$ with poles of
order $k + 1$ along $S$.

We would like to apply to $A^q(k)$ a generalization of the residue map
which holds for the case where $S$ is singular. Some examples of this situation
appear in our joint work with Regge, where we are dealing with algebraic
differential forms with polar singularities on a linear family of hypersurfaces.
Bertini's theorem asserts that the generic hypersurface has no singularity
outside the basic locus of the family. But the physics of the problem, where
we are dealing with a general diagram with more than one loop, obliges us to
consider the case where the generic hypersurface is singular on this fixed
locus.

It turns out that the residue map can still be defined on a linear subset
of the differential forms $A^q(k)$. We shall denote this vector space by $A^q(k)_\Sigma$.
For $k = 0$ the conditions to define the residue map $R$ are linear: one requires
that the differential forms vanish on the singular locus of $S$. For $k > 0$ they
are still linear but more complicated to state. Griffiths sketches the definition




only for the case of curves and surfaces. I shall refer to his paper and to
Hodge and Atiyah.

The geometrical problem associated with these conditions is the relationship
between the homological properties of $S$ and of its desingularization
$\tilde{S}$. Let $f$ be the blowing down map from $\tilde{S}$ to $S$

$$\tilde{S} \xrightarrow{\ f\ } S \hookrightarrow X$$

Then the differential forms on $S$ are mapped into the differential forms on $\tilde{S}$.

What is obtained by the operations of restriction and residue from the
differential forms of $X$ is the cohomology of $\tilde{S}$. We shall put on record this
geometric interpretation of the residue map for the singular case.

Theorem. Let $S$ be a hypersurface in $X$. Then there exists a residue map $R$
such that

$$R: A^{q+1}(k)_\Sigma \to \sum_{j \le k} H^{q-j,j}(\tilde{S})$$

and further

$$\sum_{j \le k} H^{q-j,j}(\tilde{S}) = \sum_{j \le k} H^{q-j,j}(X) \oplus R(A^{q+1}(k)_\Sigma)$$
What was achieved by the use of Hironaka’s resolution of singularities and 
normalization was to bring us back to a situation quite similar to our starting 
simple case where the hypersurface S was not singular. 

As in the following we shall have to deal with a linear family of hyper- 
surfaces in P” which has a fixed basis consisting of several linear subspaces of 
P", I want to record here a useful formula which shows how one can still 
calculate the relevant homology group after a ‘‘ permissible”? blowing up. 


Theorem. (Denniston, Annals of Mathematics p. 10, 1956.) Let $\tilde{V}$ be the
inverse image of a nonsingular algebraic variety $V$ in $P^n$ under the monoidal
transformation having $V$ as the center. Let $\tilde{P}^n$ be the transform of $P^n$. Then

$$H_i(\tilde{P}^n) = H_i(P^n) \oplus \sum_{j=1}^{n-m-1} H_{i-2j}(V)$$

$\tilde{V}$ is just a fiber space over $V$ with projective fiber $P^{n-m-1}$ ($m = \dim V$).
Introducing Euler's characteristic (see Regge's talk), we have

$$\chi(\tilde{V}) = \chi(P^{n-m-1})\, \chi(V)$$
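The two formulas above lend themselves to a quick check on small cases. The sketch below is an illustration only: it assumes the usual summation range $j = 1, \ldots, n - m - 1$ in the homology formula, encodes a variety by its list of Betti numbers, and uses hypothetical function names (`betti_blow_up`, `euler`, and so on) that do not come from the lecture.

```python
def betti_projective_space(n):
    """Betti numbers b_0, ..., b_2n of complex projective space P^n."""
    return [1 if i % 2 == 0 else 0 for i in range(2 * n + 1)]


def betti_blow_up(n, betti_v):
    """Betti numbers of P^n blown up along V (monoidal transformation).

    betti_v lists b_0, ..., b_2m of the center V, with m = dim V.
    Assumes b_i(blow-up) = b_i(P^n) + sum_{j=1}^{n-m-1} b_{i-2j}(V).
    """
    m = (len(betti_v) - 1) // 2
    betti = betti_projective_space(n)
    for j in range(1, n - m):            # j = 1, ..., n - m - 1
        for i, b in enumerate(betti_v):
            betti[i + 2 * j] += b
    return betti


def euler(betti):
    """Euler characteristic as the alternating sum of Betti numbers."""
    return sum((-1) ** i * b for i, b in enumerate(betti))


def euler_exceptional(n, betti_v):
    """chi of the inverse image of V: chi(P^(n-m-1)) * chi(V)."""
    m = (len(betti_v) - 1) // 2
    return euler(betti_projective_space(n - m - 1)) * euler(betti_v)


point = [1]                              # V = a point, m = 0
line = betti_projective_space(1)         # V = a line P^1, m = 1
print(betti_blow_up(2, point), euler(betti_blow_up(2, point)))
print(betti_blow_up(3, line), euler(betti_blow_up(3, line)))
```

In both examples the Euler characteristic of the blown-up space equals $\chi(P^n) + \chi(\tilde{V}) - \chi(V)$, which is what one expects when $V$ is replaced by $\tilde{V}$.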




PART 5: FEYNMAN INTEGRALS REVISITED— 
THE α-REPRESENTATION


Let $F$ be a holomorphic application of a complex analytic manifold $X^*$
into $X$

$$F: X^* \to X$$

Then we have an induced map on the differential forms $F^*$ defined as follows:
$(F^*\varphi)(x^*)$ is the form $\varphi(F(x^*))$. Let us restrict rather severely the map $F$. It
can be quite generalized but, for simplicity, we follow Leray.

Let $F^{-1}(S)$ be the set of points that $F$ maps into $S$. We shall assume
$F$, $S$, $S'$ such that $F^{-1}(S)$, $F^{-1}(S')$ are also analytic manifolds, in general
position again; we shall also require that the transforms of the local irreducible
equations $s_i(F(x^*), F(y^*)) = 0$ still be local irreducible equations.

Then the main functorial property of this change of variables is the
commutation of Res with $F^*$:

$$F^*\Big( \frac{\omega}{dS}\Big|_S \Big) = \frac{F^*\omega}{dF^*S}\Big|_{F^{-1}(S)}$$

It is a very interesting circle of equations which deserves a more complete
investigation elsewhere. Trying to get rid of one particular integral
representation in the hope of getting a "very good" one, we are led to consider all
schemes of the form $Y \to T$ and the relations (or maps) between them:

$$\begin{array}{ccc}
Y^* & \longrightarrow & Y \\
 & \searrow \quad \swarrow & \\
 & T &
\end{array}$$


If they define the same germ of analytic functions, we should expect that
$\pi_1(T - L)$ operates "in a very similar manner" on their homology vector
spaces or, rather, since $\pi_1(T - L)$ operates only through the "connection" which
lifts vector fields from $T - L$ into vertical vector fields in $X$, that we have the
same representation of $\pi_1(T - L)$. Indeed, something like a universal model $M$
exists, but I shall refer to Griffiths again (I, II, III) and his study of the map
$\Phi: (T - L) \to M$, where $\Phi$ is obtained by studying all the periods of the basic
differential forms over a basis of the homology cycles. I shall only make a
remark: we are almost in a situation first considered by Grothendieck, and
proved by Borel-Narasimhan in the complex analytic case. I state only a
simplified version.




Borel-Narasimhan Theorem


Let $Y$ be the complement in a compact analytic space $T$ of an analytic
set of codimension 1. Let $M$ be a complex space which admits as nonramified
covering a bounded, open set in some complex space $C^m$.

Then, if two holomorphic maps

$$\Phi, \Phi': Y \to M$$

take the same values in a point $t_0 \in Y$,

$$\Phi(t_0) = \Phi'(t_0) = m_0 \in M$$

and if they have the same action on $\pi_1(Y)$:

$$\Phi_* \pi_1(Y) = \Phi'_* \pi_1(Y)$$

then $\Phi = \Phi'$.

To use (if possible) this theorem, the main problem here is: what
compactification of the parameter space $T$ do we have to choose?

My main purpose is to introduce you in a pedestrian way to a wider 
generalization. 

There are at least two unsatisfactory features in our study of the
$k$-integrals: first, Lorentz invariance was hidden although all "singularities"
were obviously Lorentz invariant; second, the basic set $(k_1, \ldots, k_r)$ is
arbitrary. One could perform a unimodular substitution $(k_1', \ldots, k_r') =
A(k_1, \ldots, k_r)$ on the set of four-momenta

$$k_i' = \sum_j A_{ij} k_j$$


To get rid of these two defects, one performs a special form of Fourier 
transformation, to be called the Hankel transformation, which belongs 
properly to the subject of the lectures given by Professors Helgason and 
Ehrenpreis by its group theoretical flavor. 


To begin simply, we write, following Feynman and Bogoliubov,

$$\frac{1}{q^2 - m^2 + i\epsilon} = \frac{1}{i} \int_0^\infty d\alpha\, e^{i\alpha(q^2 - m^2 + i\epsilon)}$$
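The parametric representation of the propagator can be checked numerically. The following is a minimal sketch, assuming a finite value of $\epsilon$ (so that the integrand decays) and a finite cutoff on $\alpha$; the function names and the sample values are illustrative and not part of the lecture.

```python
import cmath

def propagator(a, eps):
    """Left-hand side: 1 / (a + i*eps), where a stands for q^2 - m^2."""
    return 1.0 / complex(a, eps)

def alpha_integral(a, eps, cutoff=80.0, steps=80000):
    """Trapezoidal estimate of (1/i) * int_0^cutoff exp(i*alpha*(a + i*eps)) d(alpha).

    The cutoff replaces the upper limit +infinity; this is harmless here
    because the integrand decays like exp(-eps * alpha).
    """
    h = cutoff / steps
    w = 1j * complex(a, eps)              # exponent per unit alpha
    total = 0.5 * (1.0 + cmath.exp(w * cutoff))
    for k in range(1, steps):
        total += cmath.exp(w * k * h)
    return total * h / 1j

lhs = propagator(1.3, 0.5)
rhs = alpha_integral(1.3, 0.5)
```

With these sample values the two sides agree to well within the quadrature error, which illustrates why $\alpha$ plays the role of a Fourier conjugate variable to $q^2 - m^2$.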


So $\alpha$ is the Fourier conjugate variable to $(q^2 - m^2)$. Substitute, for each
internal line of the Feynman integral,

$$\frac{1}{q_i^2 - m^2 + i\epsilon} \to \frac{1}{i} \int_0^\infty d\alpha_i\, e^{i\alpha_i(q_i^2 - m^2 + i\epsilon)}$$

So we have one $\alpha$-parameter for each internal line. Now


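$$F_G(t) = \int dk_1 \wedge dk_2 \wedge \dots \wedge dk_r \int_0^\infty d\alpha_1 \wedge d\alpha_2 \wedge \dots \wedge d\alpha_n\, e^{iQ(\alpha, p, k)}$$

where

$$Q(\alpha, p, k) = \sum_{i=1}^n \alpha_i \Big(\sum_j \epsilon_{ij} k_j + \sum_j \eta_{ij} p_j\Big)^2 - \Big(\sum_{i=1}^n \alpha_i\Big)(m^2 - i\epsilon)$$

is quadratic in the $k$-variables (and in the external momenta $p_j$).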


Imagine that you refer this $k$-quadric to its axes, then to its center
(Morse), and then perform the easy integration on the new $k$-variables. One
obtains

$$F_G(t) = \int d\alpha \Big\{ \int dk\, e^{iQ} \Big\}$$


Note the interchange of the order of integration. The end result is as follows: 


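Symanzik's Formula


$$F_G(t) = \int_0^\infty \cdots \int_0^\infty \frac{d\alpha_1 \wedge \dots \wedge d\alpha_n}{d_G(\alpha)^2}\, e^{i[D_G(t, \alpha)/d_G(\alpha) - \sum_i m^2 \alpha_i]}$$


where $D_G(t, \alpha)/d_G(\alpha)$ is a rational function of the parameters $\alpha$,
homogeneous of degree 1, which is built as follows: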


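Rule. Let $\Gamma$ be a cut separating $G$ into two connected pieces $G_1^\Gamma$, $G_2^\Gamma$.
Let

$$z_\Gamma = \Big(\sum_{G_1^\Gamma} p_j\Big)^2$$

where $\sum_{G_i^\Gamma}$ refers to the sum of all external momenta which end up in $G_i^\Gamma$. Then

$$D_G(t, \alpha) = \sum_\Gamma z_\Gamma \Big(\prod_{i \in \Gamma} \alpha_i\Big)\, d_{G_1^\Gamma}(\alpha)\, d_{G_2^\Gamma}(\alpha)$$

and

$$d_G(\alpha) = \sum_T \Big(\prod_{i \in T^*} \alpha_i\Big)$$

where $T$ runs over all possible choices of a tree connecting all the vertices of $G$,
and $T^*$ is the complement of $T$:

$$T + T^* = \text{all internal lines of } G$$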
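The rule for $d_G$ (a sum over trees $T$ of the product of the $\alpha_i$ on the complement $T^*$) can be sketched in a few lines. This is only an illustration under assumptions: the graph encoding (a dictionary from line labels to pairs of vertices) and the function names are hypothetical, and trees are found by brute force, which is adequate only for small graphs.

```python
from itertools import combinations

def is_spanning_tree(vertices, lines, chosen):
    """True if the chosen lines connect all vertices without closing a cycle."""
    root = {v: v for v in vertices}

    def find(v):
        while root[v] != v:
            v = root[v]
        return v

    for name in chosen:
        a, b = (find(v) for v in lines[name])
        if a == b:
            return False          # this line would close a cycle
        root[a] = b
    return len(chosen) == len(vertices) - 1

def d_G_monomials(lines):
    """One monomial (tuple of line labels in T*) per tree T of the graph."""
    vertices = {v for ends in lines.values() for v in ends}
    names = sorted(lines)
    monomials = []
    for chosen in combinations(names, len(vertices) - 1):
        if is_spanning_tree(vertices, lines, chosen):
            monomials.append(tuple(n for n in names if n not in chosen))
    return monomials

def d_G(lines, alpha):
    """Evaluate d_G at numerical values alpha[label] of the parameters."""
    total = 0
    for mono in d_G_monomials(lines):
        term = 1                  # empty T* contributes 1
        for name in mono:
            term *= alpha[name]
        total += term
    return total

# Internal lines only (external momenta do not enter d_G).
TRIANGLE = {1: ("B", "C"), 2: ("C", "A"), 3: ("A", "B")}
SUNRISE = {1: ("A", "B"), 2: ("A", "B"), 3: ("A", "B")}
```

For the triangle this gives $d_G = \alpha_1 + \alpha_2 + \alpha_3$ (one independent cycle, degree 1), and for three lines joining two vertices it gives $\alpha_2\alpha_3 + \alpha_1\alpha_3 + \alpha_1\alpha_2$ (two independent cycles, degree 2).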


If $T^*$ is empty, put $d_G = 1$. Note that the degree of $D_G$ is $r + 1$ and the
degree of $d_G$ is $r$, $r$ being the number of independent cycles of $G$. The last
touch is to introduce $\alpha_i \to \lambda \alpha_i$ and to get rid of the exponential by
performing the integration over $\lambda$, but for purposes of "renormalization" it is
often convenient not to do this last step (see Hepp).

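What we shall proceed to "idealize mathematically," then, is an integral
of the form

$$F_G(t) = \int_{T^n} d\alpha_1 \wedge \dots \wedge d\alpha_n\, \frac{d_G(\alpha)^{n-2r-2}}{\big[D_G(t, \alpha) - (\sum_i m^2 \alpha_i)\, d_G(\alpha)\big]^{n-2r}}$$

where $T^n$ is the fundamental "affine" simplex $\alpha_i \ge 0$.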


This can easily be put in projective form, and we end up with the following
setup:

In the complex projective space $P^{n-1}$, let $W$ be (with respect to the
variables $z$) a linear family of hypersurfaces of degree $r + 1$ with equation

$$A_G(\alpha, z) = D_G(\alpha, z) - m^2 \Big(\sum_i \alpha_i\Big) d_G(\alpha)$$

Let $h$ be a relative cycle in

$$H_{n-1}\Big(P^{n-1} - W,\ \bigcup_{i=1}^n (\Pi_i - W)\Big)$$

where $\Pi_i$ is the complex hyperplane with equation $\alpha_i = 0$. Let $\eta$ be the
$(n - 1)$-projective holomorphic form of $P^{n-1}$:

$$\eta = \sum_{i=1}^n (-1)^i \alpha_i\, d\alpha_1 \wedge d\alpha_2 \wedge \dots \wedge \widehat{d\alpha_i} \wedge \dots \wedge d\alpha_n$$
i= 


Assume $W$ to be nonsingular and transversal to the $n$ complex faces of the
fundamental simplex $T^n$, that is, to the $n$ hyperplanes $\Pi_i$, at least for a
generic value of $t$. In other words, $(W_t, \Pi_1, \dots, \Pi_n)$ are in general position.
Then our idealized model is

$$F_G(t) = \int_h \frac{d_G(\alpha)^{n-2r-2}}{A_G(\alpha, z)^{n-2r}}\, \eta$$


Assume for simplicity $n - 2r - 2 \ge 0$. The integrand is trivially a closed
rational $(n - 1)$-form with polar singularity of order $(n - 2r)$ on $W$. The
restriction of this closed form to the $\Pi_i$ vanishes, for there is not "enough
dimension" for the differential form, holomorphic on $P^{n-1} - W$, to survive.
Construct, as before, the new family $(P^{n-1} \times T, S)$, where $S$ is the
bunch formed by $W_t$ (linear system) and the fixed hyperplanes $\Pi_i$ $(i = 1 \dots n)$.
Now the same algebraic set $L$ of codimension 1 as before is obtained as the
locus of the points $t \in T$ where the little isotopy theorem does not apply,
that is,
1. $W_t$ is transversal to the $\Pi_i$, but is singular. In order that $W_t$ be singular, $t$
should be such that the system of $n$ homogeneous equations of
degree $r$ in the variables $(\alpha_i)$ has a nontrivial solution:


Van der Waerden R,(z) = 0 


2. Let $\Pi_I = \bigcap_{i \in I} \Pi_i$, where $I$ is a subset $I \subset [1 \dots n]$; then $W_t$ is tangent
to $\Pi_I$, that is, $W_t \cap \Pi_I$ is singular. Again the system below has a
nontrivial solution if:


Van der Waerden R,(z) = 0 


I shall not develop again the interesting interrelationship between elimination
theory and the geometric singularity of the family $(P^{n-1} \times T, S)$.

The interesting new viewpoint we gained is that the tangential properties
of $W_t$ came neatly into focus. It would be quite natural, if I had time, to
develop at this point Leray's second important memoir relevant to our problem.
Instead, I shall write a long appendix, since I think this approach is best
suited for physicists, and sketch here briefly the main ideas involved.


Differential Forms and Functions with Algebraic Singular 
Support (Nilsson) 


Definition. Let $A_n$ consist of all functions $f$ such that there exists an algebraic
manifold $V_f$ in $\mathbf{C}^n$ with equation $p(x) = 0$ and such that


1. $f$ is regular analytic in $\mathbf{C}^n - V_f$ and in general multivalued.

2. All the determinations of $f$ at any point $x_0 \in \mathbf{C}^n - V_f$ span a linear
space over $\mathbf{C}$ of finite dimension.

3. There exist $x_0 \in \mathbf{C}^n - V_f$, a real number $a$, and a polynomial $R(x)$ such
that for every determination $f_0$ of $f$ at $x_0$, there exists a constant $C$
such that

$$|f_{0\gamma}(x)| < C\, \frac{(1 + |x|)^a}{|R(x)|}$$

for all paths $\gamma$ of rank $(n_1, n_2)$ starting at $x_0$; $f_{0\gamma}$ is the analytic
continuation of $f_0$ along $\gamma$.


The paths one uses are of a special nature. We shall call $V_f$ the algebraic
singular support of $f$ and note


We want to restrict the spiralling around $V_f$. Hence:


Definition of the Rank of a Path. Let $\gamma = x(t) = (x_1(t), \dots, x_n(t))$ consist of a
finite number of pieces where the components $x_i(t)$ are regular algebraic
functions of $t$ (in fact, polynomials). If the number of pieces is smaller than
$n_1$, and if $n_2$ is larger than or equal to the degrees of the polynomials, then $\gamma$ is
said to be of rank $(n_1, n_2)$.


The local form of this notion (that is, $f$ is of class $A$ in an open neighborhood
of a point) implies that the analytic continuation of the function is of class
$A$. Hence, for our germ $f$ there exists a globally defined algebraic variety
$V_f$ in $\mathbf{C}^n$ such that the analytic continuation is regular on $\mathbf{C}^n - V_f$.

Let us now consider an algebraic manifold $V(t) \subset \mathbf{C}^n$, where $t \in \mathbf{C}^m$ and
$V(t)$ is given by $(n - r)$ equations

$$s_1(x, t) = 0, \quad s_2(x, t) = 0, \quad \dots, \quad s_{n-r}(x, t) = 0$$

Let $R(t)$ be the set of points $x \in V(t)$ where the gradients $s_{1,x}, s_{2,x}, \dots, s_{n-r,x}$
are linearly independent over $\mathbf{C}$; $R(t)$ is an open algebraic
manifold of dimension $r$. Introducing local coordinates, a holomorphic
$p$-form on $R(t)$ can be written locally as

$$\omega_t(x) = \sum_I a_I(t, x)\, dx^I$$

where

$$dx^I = dx_{i_1} \wedge dx_{i_2} \wedge \dots \wedge dx_{i_p}, \qquad I = [i_1 < i_2 < \dots < i_p]$$

The definition of an $\omega_t$ belonging to class $A$ is now straightforward: require
each coefficient $a_I(t, x)$ to be of class $A$. Denote by $SS[\omega]$ the union of the
singular supports of the coefficients of the differential form.


Main Theorem. Let $\gamma(t)$ be a compact piecewise differentiable cycle on
$R(t)$. Let $\omega_t$ be a closed holomorphic differential form on $R(t)$. Suppose
(1) $\gamma(t)$ depends continuously on $t$, and (2) $\omega_t$ is of class $A$. Then

$$g(t) = \int_{\gamma(t)} \omega_t(x)$$

is of class $A$.


Application: Feynman integrals and their compositions such as defined
by unitarity relations belong to class $A$.

A cycle such as $\gamma(t)$ is termed regular. The proof is by induction from
the one-dimensional case.


Lemma 1 (Nilsson-Leray). Let $f(x, y)$ be of class $A$ with respect to $x \in \mathbf{C}^n$
and $y \in \mathbf{C}$. Let $V_f$ be given by $S(x, y) = 0$. Let $\eta_1(x)$, $\eta_2(x)$ be algebraic
functions on $\mathbf{C}^n$ such that

$$(x, \eta_1(x)) \notin V_f, \qquad (x, \eta_2(x)) \notin V_f$$

for some $x$ and all determinations of the algebraic functions $\eta_i(x)$. Then

$$g(x) = \int_{\eta_1(x)}^{\eta_2(x)} f(x, y)\, dy$$

belongs to class $A(x)$, where the path of integration $(\eta_1(x), \eta_2(x))$ does not
meet $V_f$. We can further describe the algebraic singular support of $g$:

$$X - V_g = \{x \mid \text{the functions } y_i(x), \text{ which are the roots of } S(x, y(x)) = 0, \text{ are regular and distinct, and the } \eta_i(x) \text{ are regular and do not meet } V_f, \text{ that is, } S(x, \eta_i(x)) \ne 0\}$$
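A one-variable illustration of the lemma may help; the choice of $f$ and of the end points below is an assumption made for the example and does not come from the lecture.

```latex
% Illustrative choice (not from the lecture): n = 1, S(x,y) = y - x,
% f(x,y) = 1/(y - x), end points \eta_1(x) = 0 and \eta_2(x) = 1.
\[
  g(x) \;=\; \int_{0}^{1} \frac{dy}{y - x} \;=\; \log\frac{x - 1}{x}.
\]
% The only root of S(x, y(x)) = 0 is y_1(x) = x, which is regular everywhere.
% The end points meet V_f exactly when S(x, \eta_i(x)) = 0, i.e. x = 0 or x = 1.
\[
  V_g \;=\; \{\,x = 0\,\} \cup \{\,x = 1\,\}.
\]
% All determinations of g differ by integer multiples of 2\pi i, so they span
% the two-dimensional space generated by g and the constant 1, and g has
% at most logarithmic growth near V_g: it is of class A.
```

Here the singular support arises purely from the end points of the relative cycle meeting $V_f$, which is the mechanism the description of $X - V_g$ isolates.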


Note that this lemma involves integration on a relative cycle
$\partial(\eta_1(x), \eta_2(x)) = \eta_2(x) - \eta_1(x)$. The main lemma could be generalized to
$\gamma(t)$ being also a relative cycle.

It is impossible to quote here all the references in the physical literature
where Lemma 1 appears. I think that one can say that it was rightly understood
from the beginning that it was basic.


Moving on, let us subdivide the cycle $\gamma$ into small simplexes. Now, we
note that a simplex $\sigma$ is given either by its vertices or by the Grassmann
coordinates of its faces and support. We can define the motion of $\sigma$ by two
means, following either the vertices or the faces.

Then we are back to the basic problem we were faced with using the
$\alpha$-form of Feynman integrals. Simply, we now have to be prepared to move
also the planes $\Pi_i$ of our simplex, and to discuss also differential forms of any
dimension. What Leray proceeds to do is to discuss our algebraic set of
equations (A) using the Cayley form of $W_t$ and the Grassmann coordinates of
the simplex, then to glue back the simplices in a cycle $\gamma$, eliminating the
parasitic singularities given by the planes of the subdivision. The notion he
introduces to discuss the dual form of the system A, the notion of
"appui" of a $q$-plane on the algebraic variety $W_t$, is fundamental, but I have
no time left: see the appendix for more details.

We shall end up by making Lemma 1 more precise. Let us discuss the
relevance question for the singular algebraic support $V_g$ of the multivalued
function $g$. We give the Picard-Lefschetz formula as an answer to the
relevance problem.

Assume $W_t$ in general position with the fundamental simplex. Then
consider the exact triple


n OC n 
H,(P" - Wel (mt; — Wat) (™;—- WJ (7; — W,)) 


H,(P"— W, ’ (aj - W.)) 


JEeA 


$A$ is a subset of $[1 \dots n]$. Apply to this exact sequence the little isotopy
theorem, where it is assumed we are in a neighborhood of $t_0$ such that
$R_A(t_0) = 0$. Assume the singularity to be localized in the face

$$\Pi_A = \bigcap_{j \in A} \Pi_j$$

Let $o$ be a loop around $R_A(t) = 0$. Then we can see that


H.( U) (1 = %), U1, - 9) 
i=1 jJéeA 
is invariant since it is isomorphic to 


excision 


Hal (mi — W). Um = WKH 9 UD 


JEA i¢A i¢A ieA 


H,(\) (%;— W)) Hal) (7;-W,) a ) (1; — W) 
i JEA 


ieA 




This second triangle makes the invariance of 


Ho Ut - # Ut - ¥9) 


more obvious; hence the critical module will be in the kernel of 0, that is, the 
image of 


.(Pr— W,, UW) 
It is easy to make this a little more precise for a simple quadratic non- 


degenerate singularity, localized in a ball B. The change of homology classes 
is given by hh + a(h)de, where oe is the image in 


H(P —-W m1 
ie A 
by the map o of the vanishing cycle in 
HM, ) m1 
ie A 


The topological criterion for relevance is obtained by introducing, as
before,


o(B, W, UJ (il, — )) € H,(B, m UT) 


ieA 


The vanishing cycle proper is 


ec Hy-1(W(\B, U m1 


ie A 
and the trace of 4 in B, denoted as 
hye H,é(B - () 1; - w) 
ieA 
We have again by taking the relative boundary on W 
Owe =e 


Descending further on the face ( \II; we have easily 
ieA 


hy! = O° Om—1 077° 0 Oyhy 
el4l = 9 od, 0° o dye 


where we put finally $A = [1, \dots, m]$ and $|A| = m$. The final formula is the
topological expression of the number $\alpha(h)$ by a Kronecker index:


h—+h + <h4, e4>d5ye 




PART 6: WORK IN PROGRESS 


In this last lecture, I want to review several points which are relevant to 
our problem: 


1. Zeeman Theorem* 
We have a spectral sequence for an arbitrary topological space X: 
E: HX, 2,)>H,-.<X) 
£, is the presheaf U ww # (U) = H,(X, X — U) 
Then to H,(X) is associated a filtration 
Fo =H(X)> F,) 3-3 Ff > Fett 3 
and for the dual spectral sequence £ 
HX) >--- me Sf > a Si," 

Then, if X is a polyhedron, we have: 


Theorem 1. 
Hi4,0H' c F,! 
For a cohomological class $\xi$, let

$$\operatorname{cod} \xi = \inf_{x \in \xi} (\dim \operatorname{supp} x)$$


Theorem 1'. If $X$ is compact, then

$$\operatorname{cod} \xi > \text{filtration of } \xi$$


2. Lefschetz Theorems 


Let $W$ be a projective, nonsingular, irreducible algebraic variety of
dimension $n$, $V$ the corresponding affine variety $V = W - P_\infty$, $W_0$ a hyperplane
section, and $V_0$ a similar quantity. Then:


Theorem 2.

$$H_{n-1}(W_0) = L + \operatorname{Inv} H_{n-1}(W_0)$$

where $L$ is a module generated by the vanishing cycles $(\delta_i)$.


* Proc. London Math. Soc., p. 155 (1963). 




Theorem 2’. 
0- Inv H,- (Vo) + L> A,- (Vo) 7 L 7 0 
where 
L = ker (H,_2(P) > H,-2(Wo)) 


where $P$ is the axis of the pencil of hyperplane sections. Further, Moishezon
has recently proved that the group of automorphisms corresponding to the
Picard-Lefschetz formula for the vanishing cycles $(\delta_i)$ is transitive on $L$.


3. Manin’s Theory 


Consider the family of elliptic curves

$$C_t = \{y^2 = x(x - 1)(x - t)\}$$

The Landau surfaces are $t = 0$, $t = 1$. Consider a period (integral of the first
kind):

$$F(t) = \int_h \omega, \qquad h \in H_1(C_t)$$


Let $\omega(x, t)$ be this algebraic differential of the first kind. Then the
differential operator

$$L_t = t(1 - t)\frac{d^2}{dt^2} - (2t - 1)\frac{d}{dt} - \frac{1}{4}$$

transforms $\omega$ into an exact form.
Hence, $F(t)$ is a solution of the Gauss hypergeometric equation

$$L_t F = 0$$
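The hypergeometric equation can be checked order by order on a power series. The sketch below assumes the standard fact that the solution regular at $t = 0$ is the Gauss series ${}_2F_1(\tfrac12, \tfrac12; 1; t)$, and takes the operator in the form $t(1-t)\,d^2/dt^2 - (2t-1)\,d/dt - \tfrac14$; the function names are illustrative.

```python
from fractions import Fraction

def period_coefficients(order):
    """Taylor coefficients c_k of 2F1(1/2, 1/2; 1; t) = sum_k c_k t^k."""
    coeffs = [Fraction(1)]
    for k in range(order):
        ratio = Fraction(2 * k + 1, 2 * k + 2) ** 2   # ((k + 1/2) / (k + 1))^2
        coeffs.append(coeffs[-1] * ratio)
    return coeffs

def apply_L(coeffs):
    """Coefficients of L_t F for F = sum_k coeffs[k] t^k (truncated series)."""
    out = [Fraction(0)] * len(coeffs)
    for k, c in enumerate(coeffs):
        if k >= 1:
            out[k - 1] += k * (k - 1) * c + k * c         # from t F'' and from F'
        out[k] += -k * (k - 1) * c - 2 * k * c - c / 4    # from -t^2 F'', -2t F', -F/4
    return out

residual = apply_L(period_coefficients(12))
```

Every coefficient of the residual vanishes exactly, except the last one, which is an artifact of truncating the series.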


In fact, Manin even considers the fundamental simplex defined by two sections
of the family, $(P_0(t), Q_0(t))$, and the integral

$$\int_{P_0(t)}^{Q_0(t)} \omega$$


Hence, it is interesting to make the differential operators with rational
coefficients of the parameter variety $T$ act on the cohomology of the fibers.

The vector fields of $T$ on the complement of $L$ act as an $r$-family
($r = \dim T$) of integrable connections ($K(t) = \text{curvature} = 0$ for $t \in T - L$).
Speaking intuitively, one can say that the curvature is a distribution
concentrated on $L$ since, around $L$, the homology classes change according to the
Picard-Lefschetz formula.

There is evidence that the following is true: Let $L_0$ be the set of regular
points of $L$. Then, if $L_0$ is locally given by an equation $R(t) = 0$, $\{R(t) = \epsilon\}$
defines locally a tubular neighborhood of $L_0$ which can be shrunk to $L_0$ in such
a way that on $L_0$ there exists an $(r - 1)$-family of integrable connections (note
that the fibers above $L_0$ are singular). Hence, the distribution character of the
curvature lies in the "transverse direction to $L_0$." Does it make sense to write
$K(t) \sim \delta(R(t))$ or some transverse derivative of $\delta$? (See Gelfand-Shilov,
Appendix, Vol. 1 and 5, for the definition of the $\delta$-function for complex variables.)


4. Algebraic De Rham Theorem 


We thought to apply differential operators to algebraic cohomology
classes, rationally dependent on $t \in T$, when we came to learn that the
following theorem was true (Grothendieck, IHES, No. 29, p. 351): Let $X$ be a
nonsingular affine variety; then

$$H^*(X, \mathbf{C}) \cong \frac{(\text{closed algebraic forms})}{(\text{exact})}$$

Let $W$ be a hypersurface of $P^n$. Then $H^*(P^n - W)$ can be calculated
by classifying the algebraic closed differential forms with polar singularity of
order $k$ on $W$ (and on $W$ only).

Let $D = 0$ be the projective equation of $W$. For the $n$-differential
forms, the needed algebraic division theorem is provided by the following simple lemma.


Lemma. Let $\eta$ be the fundamental $n$-projective form.

Let $\omega = \eta H / D^k$ be a closed differential rational form on $P^n - W$, with $H$
a homogeneous polynomial such that $\deg H + n = k \cdot \deg D$. Then
$\omega \sim \omega' = \eta H' / D^{k-1}$ iff $H \in (\partial D / \partial \alpha_i)$, the ideal generated by the partial
derivatives. We conjecture that if $W$ is given by $D d = 0$ (hence, is reducible), then

Lemma. 


/ 


nH nH’ 
= PO) ee 
D™d" D”™ d" 


with (m’, n’) < (m, n) iff 
nee eee 
Oa, Ou, On, 0x; Oa; Cu, 
Note that these ideals should be considered as modules over the rational
functions of $T$ (in fact, polynomials in $t$ suffice).


Recall.

$$F_G(t) = \int_h \frac{d(\alpha)^{n-2r-2}}{D(\alpha, z)^{n-2r}}\, \eta$$

If $n - 2r - 2 < 0$, then the factor $d$ in the denominator gives rise to divergence.
Noting that $F_G(t)$ is to be interpreted as a distribution in $t$


_ n 9(z) dz 
(Fo, 9) = {{ Dio, z)"> 2*d(ay 22 


T(a)" x T(z) 


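we see that integrating by parts allows us to change the multiplicity $(n - 2r)$ of

$$D(\alpha, z) = \sum_i z_i A_i(\alpha) - m^2 \Big(\sum_i \alpha_i\Big) d(\alpha)$$

since

$$\frac{\partial}{\partial z_i} \frac{1}{D(\alpha, z)} = -\frac{A_i(\alpha)}{D(\alpha, z)^2}$$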


Recall the role that the finite dimension of the linear space of all determinations
was playing in the definition of the class $A$. The differential equations which
justify the name of "hyperfuchsian functions" for $F_G(t)$ are formed by
applying differential operators with rational coefficients in $t$ to the closed
differential forms $\omega(x, t)$ in such a way that

$$\sum_i L_i(t)\, \omega(x, t) = d_x \eta(x, t)$$

or, differentiating with respect to a single parameter $(\partial/\partial z, \partial^2/\partial z^2, \dots)$ the
cohomology class $\omega$, we stop when the new cohomology classes $(\partial^i \omega / \partial z^i)$ are no
longer linearly independent over the rational functions of $T$, that is,

$$\sum_i a_i(z)\, \frac{\partial^i \omega}{\partial z^i} = d\eta$$


Note that this also justifies the notion that the connection, integrable over
$T - L$, replaces the fundamental group $\pi_1(T - L)$ which is ordinarily used to form
the differential equations with uniform coefficients. We checked that this
procedure applied to our special situation (relative homology class for
integration; see Regge's lecture), and was leading again to the elimination theory
and subsequent formation of discriminants we meet when dealing with the
Landau set $L$. We were led back to rediscover the way Macaulay came to
write such resultants as "large determinants."

The most striking form one can give to this algebraic division theorem 
(to reduce the order of the polar singularity of the algebraic differential form) 
is the following formulation of a theorem of Macaulay. 


Lemma (Griffiths-Mumford). Let $Q_1, \dots, Q_n$ be homogeneous polynomials
in $(\alpha_1, \dots, \alpha_n)$ of degrees $(r_1, \dots, r_n)$, such that the radical of the
ideal $(Q_1, \dots, Q_n)$ is

$$\sqrt{(Q_1, \dots, Q_n)} = (\alpha_1, \dots, \alpha_n)$$


then 


Qe(Q,,...,0,)+m" 


mic 1113 On) [oem m 
Qin" <(Q1,..-s Qn) on ee 
j=l 
Note this gives a bound for $\dim H^*(P^n - W)$, taking $Q_i = \partial D / \partial \alpha_i$.
Needless to say, the algebraic calculation agrees with the topological
formula of Hirzebruch when $W$ is nonsingular. For the general case, it seems
that the K-theory for the algebraic coherent sheaves and the corresponding
theory of intersection of algebraic cycles can be adapted to the differentiable
case and hence could relate to Thom's stratification and isotopy theorem,
which we can visualize more easily.
To end well, we shall begin to study an example: the triangle
diagram (Fig. 21), which helped Källén and Wightman to calculate the envelope
of holomorphy of the three-point function in axiomatic field theory. Let us
work out some details:


FIGURE 21


$$F_G(z_1, z_2, z_3) = \int_{T^2} \frac{\alpha_1\, d\alpha_2 \wedge d\alpha_3 - \alpha_2\, d\alpha_1 \wedge d\alpha_3 + \alpha_3\, d\alpha_1 \wedge d\alpha_2}{\big(z_1 \alpha_2 \alpha_3 + z_2 \alpha_1 \alpha_3 + z_3 \alpha_1 \alpha_2 - m^2 (\alpha_1 + \alpha_2 + \alpha_3)^2\big)(\alpha_1 + \alpha_2 + \alpha_3)}$$

where the domain of integration (Fig. 22) is

$$T^2 = \{\alpha_1 \ge 0,\ \alpha_2 \ge 0,\ \alpha_1 + \alpha_2 \le 1\}$$


FIGURE 22


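The Landau equations become

$$\frac{\partial D}{\partial \alpha_1} = \frac{\partial D}{\partial \alpha_2} = \frac{\partial D}{\partial \alpha_3} = 0 \implies \begin{vmatrix} -2m^2 & z_3 - 2m^2 & z_2 - 2m^2 \\ z_3 - 2m^2 & -2m^2 & z_1 - 2m^2 \\ z_2 - 2m^2 & z_1 - 2m^2 & -2m^2 \end{vmatrix} = 0$$

(symbol $\triangle$; denote this branch of the Landau set by $Q$),

$$\alpha_3 = 0, \quad \frac{\partial D}{\partial \alpha_1} = \frac{\partial D}{\partial \alpha_2} = 0 \implies \begin{vmatrix} -2m^2 & z_3 - 2m^2 \\ z_3 - 2m^2 & -2m^2 \end{vmatrix} = 0 \implies (z_3 - 4m^2)\, z_3 = 0$$

and by permutation,

$$(z_1 - 4m^2)\, z_1 = 0, \qquad (z_2 - 4m^2)\, z_2 = 0$$

From the factor $(\alpha_1 + \alpha_2 + \alpha_3)$ comes also the Landau surface

$$z_1^2 + z_2^2 + z_3^2 - 2 z_1 z_2 - 2 z_2 z_3 - 2 z_3 z_1 = 0$$

which does not depend on the internal masses and expresses the vanishing of
the Gram determinant (but we shall not be concerned with this):

$$\begin{vmatrix} p_1^2 & p_1 \cdot p_2 \\ p_1 \cdot p_2 & p_2^2 \end{vmatrix} = 0$$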
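The algebra behind these Landau conditions for the triangle can be checked with exact rational arithmetic. The sketch below assumes $D(\alpha, z) = z_1\alpha_2\alpha_3 + z_2\alpha_3\alpha_1 + z_3\alpha_1\alpha_2 - m^2(\alpha_1 + \alpha_2 + \alpha_3)^2$ with equal internal masses, and $z_i = p_i^2$ with $p_1 + p_2 + p_3 = 0$ for the Gram determinant; all function names are illustrative.

```python
from fractions import Fraction

def D(alpha, z1, z2, z3, m2):
    """Triangle denominator: quadratic form in the alpha's, linear in the z's."""
    a1, a2, a3 = alpha
    return z1 * a2 * a3 + z2 * a3 * a1 + z3 * a1 * a2 - m2 * (a1 + a2 + a3) ** 2

def landau_matrix(z1, z2, z3, m2):
    """Symmetric matrix M with dD/d(alpha_i) = sum_j M[i][j] * alpha_j."""
    mu = -2 * m2
    return [[mu, z3 + mu, z2 + mu],
            [z3 + mu, mu, z1 + mu],
            [z2 + mu, z1 + mu, mu]]

def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def face_det(z3, m2):
    """2x2 determinant for the face alpha_3 = 0."""
    mu = -2 * m2
    return mu * mu - (z3 + mu) ** 2

def kallen(z1, z2, z3):
    return z1**2 + z2**2 + z3**2 - 2*z1*z2 - 2*z2*z3 - 2*z3*z1

def gram_det(z1, z2, z3):
    """p1^2 p2^2 - (p1.p2)^2, using p3 = -(p1 + p2), so 2 p1.p2 = z3 - z1 - z2."""
    p1p2 = Fraction(z3 - z1 - z2, 2)
    return z1 * z2 - p1p2 ** 2
```

Three facts are illustrated: the matrix is the Hessian of $D$ (so $\alpha^T M \alpha = 2D$), the face determinant factorizes as $-z_3(z_3 - 4m^2)$, and the Gram determinant equals $-\tfrac14$ times the quadratic expression in the $z_i$. For equal arguments $z_1 = z_2 = z_3 = z$ the $3 \times 3$ determinant reduces to $z^2(2z - 6m^2)$, so the branch $Q$ passes through $z = 3m^2$.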

Fix $z_3 < 4m^2$ and study the multivalued analytic function $F_G(z_1, z_2)$
by first plotting the Landau curves (see Fig. 23). Then an easy majorization
lemma on real quadratic forms gives $F_G(z_1, z_2)$ real analytic in the hatched
region, since $D(\alpha, z)$ does not vanish for $\alpha \in T^2$ and $\operatorname{Re} z$ in this region.


FIGURE 23 


To each Landau curve is associated, in the $\alpha$-integral "above," a
vanishing cycle which describes topologically the relevance of the ramification
for our function $F_G(z_1, z_2)$ and its "periods."

To have an example, let us study the absorptive part for $z_1 \sim 4m^2$, but
starting from a real value $(z_1, z_2)$ in the hatched region. It is easy to see that
the Landau curve $z_1 = 4m^2$ is relevant and that the discontinuity of $F_G(z_1, z_2)$
is given by the integral


da, A da, 
be, (Z 407 M3 + 27 Oy Hs + 24H ,H, — M(H, +, + H5)7?MOH, + H, + O35) 


= | Res [ ] da, 


where $e_1$ is the relative cycle of the next figure, and $\delta e_1$ the Leray coboundary
"normal to $W$" (see Fig. 24). Note then that the point of tangency between
the Landau curves $Q$ and $z_1 = 4m^2$ corresponds to the "interference" of two
singularities of different types:


FIGURE 24


(1) $z_1 = 4m^2 \Rightarrow W$ tangent to $\alpha_1 = 0$
(2) $(z_1, z_2) \in Q \Rightarrow W$ singular, that is, degenerating into two lines, since $W$ is a
conic.


The local homotopy group near $A$, $\pi_1(T - L)$, is easy to study.

Surround $A$ by a small sphere $S^3$, and look at the intersection of $S^3$ with
$Q$ or $z_1 = 4m^2$. Using Hopf's fibering and the homotopy exact sequence of a
fibering (see Professor Bott's lecture) one finds that $\pi_1^{\mathrm{loc}}(T - L)$ has two
generators $a$, $b$, one for each Landau surface, with the relation $(ab)^2 = (ba)^2$.
Hence $\pi_1^{\mathrm{loc}}(T - L)$ is not Abelian.


$a$: loop around $Q$,


$b$: loop around the threshold $z_1 = 4m^2$


In the same manner, we have for $B$ $(ac)^2 = (ca)^2$, where $c$ is a loop around
$z_2 = 4m^2$. Further $bc = cb$.

This obviously implies relations between the "periods," as you can see
by applying the explicit description we have given of the actions on the
homology groups corresponding to the generators $a$, $b$. Hence we are led
once again to consider the representation of $\pi_1(T - L)$ in the vector space of
the homology classes which is, in this case, a vector space $V$ of easily
computed dimension ($\dim_{\mathbf{C}} V = 7$).

Consider the kernel of this representation $\rho: \pi_1(T - L) \to \operatorname{End} V$. It is
an invariant subgroup.

Hence, there exists a universal covering space for all determinations.
But we deal with a particular form $\omega_t$. It is a linear functional on $V$; hence
it has a kernel, which is composed of the cycles $c$ on which

$$\int_c \omega_t$$

vanishes. Hence the covering space associated to

$$F(z) = \int \omega_t$$

is in general non-Galoisian.

A last important question: Is the representation $(\rho)$ irreducible?
Physical reasoning (things are so nicely ordered in the world, hence in the
physical region) leads us to suspect a hierarchy; hence, a filtration of nested
vector spaces: the successive discontinuities seem to have fewer singularities
than the initial integral.

Maybe we were misled; the global (relations between the periods in the
generic fiber) is "local," and we need something like the filtration which
occurs in the local study of the automorphisms induced by the fundamental
group of the parameter variety of a general algebraic family.

For all this, I shall refer you to Griffiths’ version of Grothendieck 
theorem (see Griffiths Part III, p. 217), or to Grothendieck himself. 

The geometrical proof of Griffiths is by induction: one introduces
"nice" pencils of hyperplane sections and constructs the Picard-Lefschetz
formula from the formula valid for the hyperplane section, by proving a
relative Picard-Lefschetz formula. We shall leave the subject here with the
remark that the study of the unitarity relations needs these generalizations of
the Picard-Lefschetz formula (see also Pham, Bull. Soc. Math. France, 93, 333,
1965).


REFERENCES 


1. R. Thom, Differential Analysis (Bombay 1964), p. 191; and Battelle Seminar,
Seattle, 1967.

2. Hironaka, "On the Equivalence of Singularities, I," in Arithmetical
Algebraic Geometry (Purdue University, 1963), p. 153; "Resolution of Singularities
of an Algebraic Variety," Ann. Math. 79, 109-326 (1964).

3. O. Zariski, "Studies in Equi-Singularity, I, II," Am. J. Math.

4. J. Leray, "Complément à un théorème de J. Nilsson." (This last reference
will appear soon in Bull. Soc. Math. France, hence will replace the intended
appendix.)


XV 


Landau Singularities 
in the Physical Region 


FREDERIC PHAM 


1 The Kinematics of Multiple Scattering Processes
2 The Physical Meaning of Landau Singularities
References


Let me start by recalling (cf. Lascoux's lectures) that one calls "Landau
singularities" the singularities of Feynman integrals considered as analytic
functions of the "external" particle momenta in the corresponding Feynman
graphs. It is not my aim to explain how Feynman integrals were introduced
in physics, and why physicists have been interested in studying their
singularities even though their faith in the integrals themselves was shaken; nor will
I speak of the sophisticated motivations (the building of "dispersion
relations") which led physicists to study singularities occurring for complex
values of the momenta. Instead, I want to show that for real momenta, in
the so-called "physical region," Landau singularities are rather simple
geometrical objects which can be introduced directly by purely "kinematical"
considerations and, therefore, should play an important role in any reasonable
theory of elementary processes, whatever the "dynamical" content of such
a theory may be. My talk will be divided into two parts: the first part will
introduce the geometrical objects called "Landau singularities," and study
their properties; the second part will deal with their physical meaning, and
the way they fit in various theories of elementary processes.


1 THE KINEMATICS OF MULTIPLE SCATTERING 
PROCESSES 


A "collision experiment" (also called "scattering experiment")
consists of sending some particles toward each other and looking at what
comes out, measuring as accurately as possible the momenta of the incoming
and outgoing particles. Let us recall basic notions of relativistic kinematics:
if $\mathbf{p}_i \in \mathbf{R}^3$ denotes the momentum of a particle $i$, of mass $m_i$, its energy $p_i^0$
is given by $p_i^0 = \sqrt{\mathbf{p}_i^2 + m_i^2}$; to have things expressed in a covariant way,
one defines the four-momentum

$$p_i = (p_i^0, \mathbf{p}_i) \in \mathbf{R}^4$$

restricted to the "mass shell"

$$M_i = \{p_i \in \mathbf{R}^4 \mid p_i^2 = m_i^2,\ p_i^0 > 0\}$$

(where $p_i^2 = (p_i^0)^2 - \mathbf{p}_i^2$ denotes the scalar product of $p_i$ with itself in the
Lorentz metric); in the sequel, we shall stick to this covariant description, and
by "momentum" always mean the four-momentum. Every collision satisfies
the law of conservation of the momentum, which says that the sum of the
momenta of the incoming particles equals the sum of the momenta of the
outgoing particles.
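These kinematical definitions translate directly into a few lines of code. The following is a minimal sketch in units where $c = 1$; the function names and the tolerance are assumptions made for the illustration.

```python
import math

def minkowski_square(p):
    """Scalar product of a four-vector with itself in the Lorentz metric (+, -, -, -)."""
    return p[0] ** 2 - (p[1] ** 2 + p[2] ** 2 + p[3] ** 2)

def on_shell(three_momentum, mass):
    """Four-momentum (p0, p) with p0 = sqrt(|p|^2 + m^2) > 0."""
    energy = math.sqrt(sum(c * c for c in three_momentum) + mass ** 2)
    return (energy, *three_momentum)

def in_mass_shell(p, mass, tol=1e-9):
    """Membership test for M_i = {p : p^2 = m^2, p0 > 0}, up to rounding."""
    return p[0] > 0 and abs(minkowski_square(p) - mass ** 2) <= tol * max(1.0, mass ** 2)

def conserved(incoming, outgoing, tol=1e-9):
    """Momentum conservation: sum of incoming equals sum of outgoing four-momenta."""
    return all(
        abs(sum(p[mu] for p in incoming) - sum(q[mu] for q in outgoing)) <= tol
        for mu in range(4)
    )

p = on_shell((3.0, 4.0, 0.0), 12.0)      # energy sqrt(25 + 144) = 13
q = on_shell((-3.0, -4.0, 0.0), 12.0)
```

The pair `p`, `q` describes two particles of equal mass in their center-of-mass frame; exchanging them gives a trivially allowed final state, while flipping the sign of the energy leaves the mass shell.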

It is convenient to represent a collision process by a graph: for instance,
Graph 1 (Fig. 1) represents a process where three particles 1, 2, 3 collide and
give three particles 1', 2', 3'. What the detailed mechanism of the collision is,
and what kind of "catastrophe" occurs while the particles occupy the same
region of space-time, is not known, and the black box of Graph 1 symbolizes
this ignorance. But imagine that by some finer experimental device one had
been able to "resolve" the catastrophe of Graph 1 into a succession of finer
catastrophes, as symbolized on Graph 2 (Fig. 2), for instance. For the time
being we do not want to ask how this can be done, or whether it has a meaning
or not; we simply want to ask whether this is kinematically permissible: for
which values of the "external" momenta $(p_1, p_2, p_3, p_{1'}, p_{2'}, p_{3'})$ can one
find "internal" momenta $(p_4, p_5, p_6)$ such that the momentum conservation
law will be satisfied at each vertex of Graph 2?


Graph 1. Elementary scattering graph.

FIGURE 1


Graph 2. Multiple scattering graph.

FIGURE 2


Note that this question could be asked also in the more general case where
Graph 1 is no longer an "elementary" scattering graph, but a multiple
scattering graph from which Graph 2 would be deduced by "resolving" some
vertex. To put the question in this general setting, denote by
$\kappa: G_2 \to G_1$ the operation of "contracting" some lines of a graph $G_2$, thus
getting a graph $G_1$; denote by $\mathcal{S}(G_i)$, $i = 1, 2$, the "space of the graph,"
defined as the product of the mass shells of all the particles of the graph,
restricted by the momentum conservation law at each vertex; one has a
canonical mapping

$$\mathcal{S}(\kappa): \mathcal{S}(G_2) \to \mathcal{S}(G_1)$$

defined by "forgetting" the momenta of the contracted lines. Then the
above question amounts to asking which points of $\mathcal{S}(G_1)$ belong to the image
of this mapping.

Our notations for graphs will be the following: a multiple scattering
graph is a connected oriented graph $G$, whose lines represent particles, and
vertices collisions between particles; such a graph must have no directed
loops, so that the set of vertices will be partially ordered: this corresponds to
the causal ordering of the successive collisions. The set of vertices will be
denoted by $V$, and the set of lines by $I$; $v(i)$ will be the incidence number of
the vertex $v$ with respect to the line $i$, equal to $+1$, $-1$, or $0$. $Z$ will denote
the group of cycles of the graph, defined as the intersection of the kernels of
all the homomorphisms

$$v_*: \mathbf{Z}(I) \to \mathbf{Z}$$

associated to the incidence functions

$$v: I \to \mathbf{Z}$$

of all the vertices of the graph ($\mathbf{Z}$ denotes the group of integers, and $\mathbf{Z}(I)$ the
free Abelian group on the set $I$). If $z \in Z$, $z(i)$ will denote the contribution of
the line $i$ to the cycle $z$ (a rational integer). Recall that a base of the free
group $Z$ can be constructed as follows: one chooses in $G$ a maximal "tree"
(subgraph without cycles); then for each line $k$ outside this tree there is a
unique cycle $z_k$ such that $z_k(k) = 1$, and the $z_k$'s form a base of $Z$. The space
of the graph is defined by

$$\mathcal{S}(G) = \Big\{p \in \prod_{i \in I} M_i \ \Big|\ \sum_{i \in I} v(i)\, p_i = 0 \quad \forall v \in V\Big\}$$
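The construction of a base of the cycle group from a maximal tree can be sketched as follows. Several choices here are assumptions made for the illustration: a line is encoded as a pair (tail, head), the incidence number is taken as $+1$ at the head and $-1$ at the tail, and the free ends of the external lines are all attached to one auxiliary vertex `"inf"`. The extra condition at `"inf"` is minus the sum of the conditions at the true vertices, so it does not change the cycle group.

```python
from collections import deque

def spanning_tree(lines):
    """Breadth-first maximal tree, returned as {vertex: (parent, line, sign)}."""
    start = next(iter(lines.values()))[0]
    parent = {start: None}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for name, (tail, head) in lines.items():
            if tail == u and head not in parent:
                parent[head] = (u, name, +1)    # line points away from the root
                queue.append(head)
            elif head == u and tail not in parent:
                parent[tail] = (u, name, -1)    # line points toward the root
                queue.append(tail)
    return parent

def chain_from_root(parent, vertex):
    """Tree chain with boundary (vertex - root), as {line: coefficient}."""
    chain = {}
    while parent[vertex] is not None:
        up, name, sign = parent[vertex]
        chain[name] = sign
        vertex = up
    return chain

def cycle_basis(lines):
    """For each line k outside the tree, the unique cycle z_k with z_k(k) = 1."""
    parent = spanning_tree(lines)
    tree_lines = {entry[1] for entry in parent.values() if entry is not None}
    basis = {}
    for k, (tail, head) in lines.items():
        if k in tree_lines:
            continue
        z = {k: 1}
        for name, c in chain_from_root(parent, tail).items():
            z[name] = z.get(name, 0) + c
        for name, c in chain_from_root(parent, head).items():
            z[name] = z.get(name, 0) - c
        basis[k] = {name: c for name, c in z.items() if c != 0}
    return basis

def is_cycle(lines, z):
    """Check sum_i v(i) z(i) = 0 at every vertex v."""
    vertices = {v for ends in lines.values() for v in ends}
    return all(
        sum(c * ((lines[n][1] == v) - (lines[n][0] == v)) for n, c in z.items()) == 0
        for v in vertices
    )

# Graph 1: one vertex "v", three incoming and three outgoing external lines.
GRAPH1 = {"1": ("inf", "v"), "2": ("inf", "v"), "3": ("inf", "v"),
          "1'": ("v", "inf"), "2'": ("v", "inf"), "3'": ("v", "inf")}
BASIS1 = cycle_basis(GRAPH1)
```

For Graph 1 the tree consists of the single line 1, and the five resulting cycles are $2 - 1$, $3 - 1$, $1' + 1$, $2' + 1$, $3' + 1$. This is a different but equivalent base of the same group of rank 5 as the one quoted in the example following Proposition 1.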


It will sometimes be convenient to forget the mass shell conditions and
consider the Euclidean space

$$\mathcal{E}(G) = \Big\{p \in (\mathbf{R}^4)^I \ \Big|\ \sum_{i \in I} v(i)\, p_i = 0 \quad \forall v \in V\Big\}$$

Then, introducing the polynomial map

$$s: \mathcal{E}(G) \to \mathbf{R}^I, \qquad p \mapsto (p_i^2)_{i \in I}$$

one sees that $\mathcal{S}(G)$ is a connected component of the algebraic variety $s^{-1}((m_i^2))$.
If $(m_i^2)_{i \in I}$ is a regular value of the map $s$, this algebraic variety is a manifold.
We thus get Proposition 1.


Proposition 1. For almost all values of the masses, the space $\mathcal{S}(G)$ is a
manifold.


It is instructive to write explicitly the condition which makes the mapping
$s$ critical. Taking for coordinate basis on $\mathcal{E}(G)$ the system of momenta
$(p_k)$ associated to the complement of a maximal tree, one finds for the tangent
map to $s$:

$$\frac{\partial s_i}{\partial p_k^\mu} = 2 p_i \cdot \frac{\partial p_i}{\partial p_k^\mu} = 2 p_{i\mu}\, z_k(i) \qquad (\mu = 0, 1, 2, 3)$$

Therefore, the mapping $s$ will be critical if and only if one can write a linear
relation with not all vanishing coefficients:

$$\sum_{i \in I} \alpha_i\, p_i\, z_k(i) = 0 \quad \forall k$$

that is,

$$\sum_{i \in I} \alpha_i\, p_i\, z(i) = 0 \quad \forall z \in Z$$
Yapzi=0 VzeZz 
iel 


EXAMPLE. In the case of Graph 1, the group of cycles can be generated
by $1 - 2$, $2 - 3$, $1 + 1'$, $1' - 2'$, $2' - 3'$, so that the above relations read

$$\alpha_1 p_1 - \alpha_2 p_2 = 0$$
$$\alpha_2 p_2 - \alpha_3 p_3 = 0$$
$$\alpha_1 p_1 + \alpha_{1'} p_{1'} = 0$$
$$\alpha_{1'} p_{1'} - \alpha_{2'} p_{2'} = 0$$
$$\alpha_{2'} p_{2'} - \alpha_{3'} p_{3'} = 0$$

simply meaning that all the momenta of the particles are parallel to each other.
This implies the following relation among the masses:

$$m_1 + m_2 + m_3 = m_{1'} + m_{2'} + m_{3'}$$
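The step from parallel momenta to the mass relation can be spelled out as follows; the unit vector $u$ is notation introduced only for this remark.

```latex
% Since every p_i is nonzero, the five relations force the \alpha's to be
% either all zero or all nonzero; the nontrivial case makes all momenta
% proportional. Each p_i lies on its mass shell with p_i^0 > 0, so
\[
  p_i = m_i\, u, \qquad u^2 = 1, \quad u^0 > 0 ,
\]
% for one and the same unit timelike vector u (a notation assumed here).
% Momentum conservation at the single vertex of Graph 1 then reads
\[
  (m_1 + m_2 + m_3)\, u \;=\; p_1 + p_2 + p_3
  \;=\; p_{1'} + p_{2'} + p_{3'} \;=\; (m_{1'} + m_{2'} + m_{3'})\, u ,
\]
% which gives the stated relation among the masses.
```

For generic masses this relation fails, so $s$ has no critical point on $\mathcal{S}(G)$, in agreement with Proposition 1.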


Now we would like to study the mapping

$$\mathcal{S}(\kappa): \mathcal{S}(G_2) \to \mathcal{S}(G_1)$$

associated to a contraction

$$\kappa: G_2 \to G_1$$

We shall denote by $G$ the subgraph of $G_2$ formed by the lines which we want
to contract, and by $I$, $V$, etc. the set of lines, vertices, etc. of this subgraph $G$.
Let us first state the obvious


Proposition 2. The map $\mathcal{S}(\kappa)$ is proper (that is, the inverse image of any
compact subset is compact).

In fact, a closed subset of the mass shell is
compact if and only if the energy $p^0$ is bounded. Thus, Proposition 2 simply
means that the boundedness of the energies of the "external" lines (that is,
the lines of $G_1$) implies the boundedness of the energies of the "internal"
lines (lines of $G$), a result which easily follows from the conservation of energy
at each vertex.

If the masses are chosen noncritical, so that both $\mathcal{S}(G_1)$ and $\mathcal{S}(G_2)$ are manifolds (see Proposition 1), one can speak of the tangent mapping to $\mathcal{S}(\kappa)$, and look for the critical points, that is, the points where this tangent mapping is not surjective. Explicitly, a tangent vector to $\mathcal{S}(G_2)$ can be represented by a tangent vector $X$ to $\mathcal{E}(G_2)$ subject to the restrictions $ds_j(X) = 0$, where $s_j = p_j^2$, $j \in I_2$. Let us compute $\operatorname{Ker} d\mathcal{S}(\kappa)$, the kernel of the tangent mapping: it consists of the vectors $X$ whose components along the "external" lines vanish; then the equations $ds_j(X) = 0$ are trivially satisfied for $j \in I_1$, and need only be written for $j \in I$, where they read

\[ ds_j(X) = \sum_k \frac{\partial s_j}{\partial p_k} \cdot X_k = 2 \sum_k z_k(j)\, p_j \cdot X_k \]

where the four-vectors $X_k$ denote the components of $X$ along the lines $k$, complement in $I$ of a maximal tree of $G$. Thus

\[ \operatorname{Ker} d\mathcal{S}(\kappa) = \Bigl\{ X \;\Big|\; \sum_k z_k(i)\, p_i \cdot X_k = 0 \quad \forall i \in I \Bigr\} \]
k 


As easily verified, saying that $d\mathcal{S}(\kappa)$ is not surjective is equivalent to saying that the above conditions for its kernel are not independent, that is, one has a relation

\[ \sum_{i \in I} \alpha_i\, p_i\, z_k(i) = 0 \qquad \forall k \]

that is,

\[ \sum_{i \in I} \alpha_i\, p_i\, z(i) = 0 \qquad \forall z \in Z \]

The not all vanishing parameters $\alpha_i$ are called the Feynman parameters, and the above equations are the Landau equations. Note that they are formally identical with the equations written in connection with Proposition 1, but the meaning of the letters $I$, $Z$ is different, since they now refer to the subgraph $G$ of the lines contracted under $\kappa$. We have thus proven the following:




Proposition 3. A point $p \in \mathcal{S}(G_2)$ is critical for $\mathcal{S}(\kappa)$ if and only if there exists a system of not-all vanishing Feynman parameters $(\alpha_i)$ satisfying the Landau equations. The possible systems of Feynman parameters form a vector space whose dimension is the corank (at the target) of the critical point.


We shall see in the second part that the Feynman parameters can be 
interpreted physically, up to a factor, as “time delays’’ between the creation 
and destruction of the corresponding particles. Thus the critical points where 
the Feynman parameters can be chosen nonnegative will play a privileged 
role. We shall call them relevant critical points. If, moreover, the corank is 1 (that is, the Feynman parameters are unique up to a proportionality factor), and if the Feynman parameters can be taken all strictly positive, we shall say that we have a leading critical point. The image in $\mathcal{S}(G_1)$ of a critical point will be called a Landau point, and it will be called a relevant (resp. leading) Landau point if it comes from a relevant (resp. leading) critical point.

Now comes the first nontrivial proposition. 


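Proposition 4. Near a leading critical point, the mapping $\mathcal{S}(\kappa)$ has the same local analytic type as the suspension of a Morse function of index zero, that is, one can choose local analytic coordinates $x_1, x_2, \dots, x_m$ in $\mathcal{S}(G_2)$ and $y_1, y_2, \dots, y_n$ in $\mathcal{S}(G_1)$ such that $\mathcal{S}(\kappa)$ reads:

\[ \begin{aligned}
y_1 &= x_1 \\
y_2 &= x_2 \\
&\;\;\vdots \\
y_{n-1} &= x_{n-1} \\
y_n &= x_n^2 + x_{n+1}^2 + \cdots + x_m^2
\end{aligned} \]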


COMMENTS. The last line above is the local model for a Morse function $y_n(x_n, x_{n+1}, \dots, x_m)$ of index zero (cf. Bott's lectures). The "suspension" operation (in Thom's terminology) consists of increasing the dimension of the source and the target in a trivial way by adding the same set of variables. The resulting type of mapping is called $S^1$ in Thom's classification of singularities. Note its very simple features, which will be useful when we come to the physical interpretation: the Landau set is a manifold of codimension 1 (the manifold $y_n = 0$); every Landau point is the projection of only one critical point (at least in the neighborhood considered); finally, the image of $\mathcal{S}(\kappa)$ is all on the same side of the Landau manifold ($y_n = 0$), which can therefore be interpreted as the "threshold of the multiple scattering process."
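
The features just listed can be checked numerically on the local model itself. The following is a minimal sketch, assuming NumPy; the function names are hypothetical and the dimensions in the usage note are chosen only for illustration. It builds the suspended Morse map, its Jacobian, and a rank test for critical points.

```python
import numpy as np

def local_model(x, n):
    """Suspended Morse map: y_i = x_i for i < n, y_n = x_n^2 + ... + x_m^2."""
    x = np.asarray(x, dtype=float)
    y = np.empty(n)
    y[: n - 1] = x[: n - 1]
    y[n - 1] = np.sum(x[n - 1:] ** 2)
    return y

def jacobian(x, n):
    """Jacobian matrix (n rows, m columns) of the local model at x."""
    x = np.asarray(x, dtype=float)
    m = x.size
    J = np.zeros((n, m))
    J[: n - 1, : n - 1] = np.eye(n - 1)
    J[n - 1, n - 1:] = 2.0 * x[n - 1:]
    return J

def is_critical(x, n):
    """A point is critical when the tangent map is not surjective."""
    return np.linalg.matrix_rank(jacobian(x, n)) < n
```

With n = 2 and m = 4, the point (0.3, 0, 0, 0) is critical and is sent to the Landau manifold y_2 = 0, while every other point is sent to y_2 > 0, on one side of it.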


SKETCH OF THE PROOF. The Morse singularity of a function is characterized by the nondegeneracy of a quadratic form, the "Hessian" of the function, defined by the matrix of second partial derivatives. Similarly, in the case of a mapping having a critical point of corank 1, one can define on the kernel of the tangent map a quadratic form called the "transverse Hessian" (defined up to multiplication by a nonzero number), whose nondegeneracy will characterize $S^1$-type singularities. The "$S^1$-type with zero transverse index" will be characterized by the positive-definiteness (or negative-definiteness) of the transverse Hessian. I have no time here to give a formal definition of the transverse Hessian; I simply indicate how it is calculated in the present situation: it is the quadratic form

\[ H(X) = \tfrac{1}{2} \sum_{i,k,k',\mu,\mu'} \alpha_i\, \frac{\partial^2 s_i}{\partial p_k^{\mu}\, \partial p_{k'}^{\mu'}}\, X_k^{\mu} X_{k'}^{\mu'} = \sum_{i,k,k'} \alpha_i\, z_k(i)\, z_{k'}(i)\, X_k \cdot X_{k'} = \sum_i \alpha_i\, Y_i^2 \]

where we have put

\[ Y_i = \sum_k z_k(i)\, X_k \]

Now the fact that $X \in \operatorname{Ker} d\mathcal{S}(\kappa)$ is expressed by the relations

\[ p_i \cdot Y_i = 0 \qquad \forall i \in I \]

(See the proof of Proposition 3.) Since the $p_i$'s are timelike vectors ($p_i^2 > 0$), this implies that the $Y_i$ must be spacelike. Therefore, for positive Feynman parameters, the transverse Hessian $H(X) = \sum_i \alpha_i Y_i^2$ is negative-definite (the $Y_i$ cannot all vanish unless $X = 0$), and Proposition 4 is proved.
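
The key step of the argument (a vector Minkowski-orthogonal to a timelike vector is spacelike) can be spot-checked numerically. This is an illustrative sketch only, assuming NumPy and the metric signature (+, -, -, -); the sampling ranges and function names are arbitrary choices made here, not part of the lecture.

```python
import numpy as np

# Assumed signature (+, -, -, -): timelike vectors have positive square.
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])

def dot(a, b):
    """Minkowski scalar product."""
    return float(a @ METRIC @ b)

def project_orthogonal(y, p):
    """Remove from y its component along the timelike vector p."""
    return y - (dot(y, p) / dot(p, p)) * p

def largest_square(trials=1000, seed=0):
    """Largest value of Y.Y found among random Y with p.Y = 0, p on-shell."""
    rng = np.random.default_rng(seed)
    worst = -np.inf
    for _ in range(trials):
        three = rng.normal(size=3)
        mass = 1.0 + rng.random()
        p = np.array([np.sqrt(mass ** 2 + three @ three), *three])
        y = project_orthogonal(rng.normal(size=4), p)
        worst = max(worst, dot(y, y))
    return worst
```

A negative return value of `largest_square()` means every sampled $Y$ was spacelike, so each term $\alpha_i Y_i^2$ with $\alpha_i > 0$ is negative.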


EXERCISE. Show that Proposition 4 still holds true when the Feynman 
parameters are allowed to vanish on a tree of G. 


Proposition 4 gave only local information. But some global information can very easily be obtained in the following way. Let $p^0 \in \mathcal{S}(G_2)$ be a relevant critical point of arbitrary corank, and choose a system $(\alpha_i)$ of nonnegative Feynman parameters satisfying the Landau equations at $p^0$. On the Euclidean space $\mathcal{E}(G_2)$, consider the linear function

\[ \ell(p) = \sum_i \alpha_i\, (p_i - p_i^0) \cdot p_i^0 \]

It vanishes at $p^0$, and is nonnegative on $\mathcal{S}(G_2)$: indeed, none of the terms $(p_i - p_i^0) \cdot p_i^0$ can be negative when both four-vectors $p_i$, $p_i^0$ belong to the same mass shell. Therefore $\{\ell(p) = 0\}$ is a supporting hyperplane for $\mathcal{S}(G_2)$ in $\mathcal{E}(G_2)$ (and even for $\mathcal{S}^*(G_2)$, the space deduced from $\mathcal{S}(G_2)$ by forgetting the mass shell conditions for the external lines, which play no role in the above reasoning). Now it immediately follows from the critical character of $p^0$ (and it can also be checked directly on the Landau equations) that this hyperplane is the inverse image of a hyperplane in $\mathcal{E}(G_1)$, that is, the function $\ell(p)$ depends only on the external momenta. This function can therefore be considered as the equation of a hyperplane in $\mathcal{E}(G_1)$, "supporting" the projection of $\mathcal{S}^*(G_2)$. We thus get the "convexity" result:


Proposition 5. Through every relevant Landau point there passes a hyperplane which supports the projection of $\mathcal{S}^*(G_2)$. In particular, it will support the Landau set.
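
The inequality behind the supporting hyperplane is that $(p - p^0) \cdot p^0 \ge 0$ whenever $p$ and $p^0$ lie on the same mass shell with positive energy. A minimal numerical sketch follows, assuming NumPy and the signature (+, -, -, -); the helper names and the sampling are illustrative assumptions.

```python
import numpy as np

def minkowski_dot(a, b):
    """Scalar product with signature (+, -, -, -)."""
    return float(a[0] * b[0] - a[1:] @ b[1:])

def on_shell(three_momentum, mass):
    """Positive-energy four-momentum on the mass shell p.p = mass^2."""
    three_momentum = np.asarray(three_momentum, dtype=float)
    energy = np.sqrt(mass ** 2 + three_momentum @ three_momentum)
    return np.concatenate(([energy], three_momentum))

def support_term(p, p0):
    """One term (p - p0).p0 of the linear function considered above."""
    return minkowski_dot(p - p0, p0)

def min_support_term(mass=1.0, trials=2000, seed=0):
    """Smallest value of the term over random on-shell momenta p."""
    rng = np.random.default_rng(seed)
    p0 = on_shell(rng.normal(size=3), mass)
    return min(
        support_term(on_shell(rng.normal(size=3), mass), p0)
        for _ in range(trials)
    )
```

The term vanishes at $p = p^0$ and stays nonnegative elsewhere on the shell, which is what makes the hyperplane a supporting one.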


Further Problems 


1. It would be interesting to investigate the possible topological structures of critical points of corank higher than one. In contradistinction to the corank 1 case, they do not seem to be generic.

2. "Hierarchy" of the singularities.

If some of the Feynman parameters vanish, one can contract the corresponding lines, thus getting another graph $G_3$ for which the given point will still be critical. One is thus led to study the singularities of a mapping given by the composition of two mappings:

\[ \mathcal{S}(G_2) \longrightarrow \mathcal{S}(G_3) \longrightarrow \mathcal{S}(G_1) \]

The following example of a "composed singularity" is a simple version of a situation actually encountered in physics: let us consider the following composed mapping of the plane into the plane

\[ f:\ \begin{cases} y_1 = x_1 \\ y_2 = x_2^2 \end{cases} \qquad g:\ \begin{cases} z_1 = y_1 + y_2 \\ z_2 = y_1^2 \end{cases} \qquad g \circ f:\ \begin{cases} z_1 = x_1 + x_2^2 \\ z_2 = x_1^2 \end{cases} \]

Intuitively, the mapping $f$ can be thought of as "folding" the x-plane along the ($x_2 = 0$) axis and projecting it on the y-plane, with the fold projected on ($y_2 = 0$); then $g$ "folds" the y-plane along another axis ($y_1 = 0$) and projects it on the z-plane, obliquely with respect to the image of the previous fold. Figure 3 shows the image of the composed map in the z-plane; the Landau set (set of critical values) consists of the ($z_2 = 0$, $z_1 > 0$) half-axis and a parabola tangent to it ($z_2 = z_1^2$). (This parabola is nothing but the image of the first fold.) This is the simplest example of what physicists call an "effective intersection" of Landau curves. For further study, see [5].


FIGURE 3. Image of a composed map. The hatched zone is covered two times, the cross-hatched zone four times.
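
The covering multiplicities quoted in the caption can be reproduced by counting preimages. The sketch below assumes the composed map is read as $z_1 = x_1 + x_2^2$, $z_2 = x_1^2$ (one reading of the garbled original; swapping the labels $z_1$ and $z_2$ gives the mirror statement). It uses only the standard library, and the function names are hypothetical.

```python
import math

def composed_map(x1, x2):
    """g after f, with f: (y1, y2) = (x1, x2^2) and g: (z1, z2) = (y1 + y2, y1^2)."""
    y1, y2 = x1, x2 ** 2
    return y1 + y2, y1 ** 2

def preimage_count(z1, z2):
    """Number of real points (x1, x2) sent to (z1, z2) by the composed map."""
    if z2 < 0:
        return 0
    count = 0
    # x1^2 = z2 has the roots +sqrt(z2) and -sqrt(z2) (a single root when z2 = 0).
    for x1 in {math.sqrt(z2), -math.sqrt(z2)}:
        remainder = z1 - x1  # x2^2 must equal this value
        if remainder > 0:
            count += 2
        elif remainder == 0:
            count += 1
    return count
```

Under this reading, a point with $z_1 > 0$ below the parabola $z_2 = z_1^2$ has four preimages, a point above it has two, and a point with $z_1 < -\sqrt{z_2}$ has none.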


2 THE PHYSICAL MEANING OF LANDAU SINGULARITIES 


The interest in physical region singularities was aroused recently by the following very simple remark of Coleman and Norton: Consider a multiple scattering process, for example, the one represented by Graph 2,¹ and imagine an ideal case where the various collisions would occur at points $v$, $v'$, $w$ of space-time; then the four-vectors $a_4 = \overrightarrow{vv'}$, $a_5 = \overrightarrow{vw}$, $a_{5'} = \overrightarrow{wv'}$ represent the space-time intervals traveled across by particles 4, 5, 5', and must be proportional to the momenta of these particles:

\[ a_i = \alpha_i\, p_i \qquad i = 4, 5, 5' \]

where the proportionality factors $\alpha_i$ can be interpreted as $\alpha_i = \tau_i / m_i$, where $\tau_i$ is the proper lifetime of particle $i$, measured in its rest frame. (Write $a_i = \tau_i\, p_i / m_i$, and note that $p_i / m_i$ is the "four-velocity" of particle $i$.) With these notations, the Landau equations $\alpha_4 p_4 = \alpha_5 p_5 + \alpha_{5'} p_{5'}$ become $a_4 = a_5 + a_{5'}$, and can be interpreted as the obvious necessary and sufficient conditions for the space-time diagram $(v, v', w)$ to exist. Furthermore, the positivity of Feynman parameters simply means that the $\tau_i$ are positive, that is, the particles cannot be destroyed before they are created.

¹ For clarity of exposition, all the reasoning of this second part is carried out on the specific example of Graph 2, although the generalization to an arbitrary graph would be quite straightforward.
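
The relation between a Feynman parameter and a proper lifetime can be illustrated with a small computation. This is a sketch only, using the signature (+, -, -, -) and units with c = 1; the function names and the sample momentum are assumptions made for the example.

```python
import math

def minkowski_square(v):
    """Square of a four-vector with signature (+, -, -, -)."""
    return v[0] ** 2 - v[1] ** 2 - v[2] ** 2 - v[3] ** 2

def displacement(alpha, p):
    """Space-time interval a = alpha * p traveled by a particle of momentum p."""
    return tuple(alpha * component for component in p)

def proper_time(alpha, p):
    """Proper lifetime, the Minkowski length of the displacement."""
    return math.sqrt(minkowski_square(displacement(alpha, p)))
```

For a particle of mass 4 with momentum (5, 3, 0, 0) and parameter 0.5, the displacement is (2.5, 1.5, 0, 0) and its proper lifetime is 0.5 times 4, in agreement with the interpretation of the parameter as lifetime divided by mass.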

Of course the above-considered ideal case goes against the laws of 
quantum mechanics, since we have been obliged to assign precise momenta 
and positions to the particles. But, anyway, we can hope to make sense out 
of it in the “‘macroscopic’’ limit, when the distances between the successive 
collisions are big with respect to the imprecisions on the positions. More 
precisely, let the wave functions $\varphi_j$ of the incoming ($j = 1, 2, 3$) and outgoing
($j = 1', 2', 3'$) particles be sufficiently sharply distributed, in the momentum
representation, around some mean values $p_j^0$ defining a leading Landau point
$(p_j^0)$ of Graph 2.

By Proposition 3, the datum of $(p_j^0)$ determines unambiguously (locally at least) a system of internal momenta $p_i^0$ ($i = 4, 5, 5'$) satisfying the Landau equations, with well-defined (up to a multiplicative factor) Feynman parameters; it therefore determines, up to an arbitrary scale factor $\tau$, the space-time separations $a_i = \tau \alpha_i p_i^0$ at which we can hope to see the successive collisions occur, if they occur at all. Therefore, supposing the wave functions $\varphi_j$ all overlap in some region of space-time, we can hope to make the multiple scattering more likely by performing different space-time translations on them, differing precisely by the above given separations $a_i$ (for big enough $\tau$); explicitly,

Particles 1 and 2 will undergo a translation $a_v$,
Particles 1' and 2' will undergo a translation $a_{v'}$,
Particles 3 and 3' will undergo a translation $a_w$,

with $a_{v'} - a_v = a_4$, $a_w - a_v = a_5$, $a_{v'} - a_w = a_{5'}$. We then expect that asymptotically, for big space-time translations chosen in this particular way, the scattering amplitude of the process (Graph 1) will not decrease as quickly as it would for arbitrary space-time translations, and will behave like the product of the amplitudes associated to the vertices of Graph 2, integrated over all possible states of the intermediate particles.

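Let us put this in equations. In the momentum representation, we are given a distribution $S(p_1, p_2, p_3; p_{1'}, p_{2'}, p_{3'})$, the integral kernel associated with the scattering operator $S$ (by Schwartz's "nuclear theorem"). The scattering amplitude reads

\[ \langle \varphi_{1'} \otimes \varphi_{2'} \otimes \varphi_{3'} | S | \varphi_1 \otimes \varphi_2 \otimes \varphi_3 \rangle = \int \prod_{j = 1,2,3,1',2',3'} \frac{d^3 p_j}{2 p_j^0}\; S(p_1, p_2, p_3; p_{1'}, p_{2'}, p_{3'}) \times \varphi_1(p_1)\, \varphi_2(p_2)\, \varphi_3(p_3)\, \bar{\varphi}_{1'}(p_{1'})\, \bar{\varphi}_{2'}(p_{2'})\, \bar{\varphi}_{3'}(p_{3'}) \tag{1} \]

and for the translated states it becomes

\[ \langle \varphi_{1'}^{a_{v'}} \otimes \varphi_{2'}^{a_{v'}} \otimes \varphi_{3'}^{a_w} | S | \varphi_1^{a_v} \otimes \varphi_2^{a_v} \otimes \varphi_3^{a_w} \rangle = \int [\text{same integrand}] \times \exp i\bigl[ (p_1 + p_2) \cdot a_v + (p_3 - p_{3'}) \cdot a_w - (p_{1'} + p_{2'}) \cdot a_{v'} \bigr] \tag{1a} \]

where the notation $\varphi^a$ is used for the state deduced from $\varphi$ by the space-time translation $a$:

\[ \varphi^a(p) = \varphi(p) \times \exp(i\, p \cdot a) \]

We would like to get for this amplitude the following asymptotic behavior

\[ \int \prod_{h = 1,2,3,1',2',3',4,5,5'} \frac{d^3 p_h}{2 p_h^0}\; S(p_1, p_2; p_4, p_5) \times S(p_3, p_5; p_{3'}, p_{5'})\, S(p_4, p_{5'}; p_{1'}, p_{2'}) \times \varphi_1(p_1)\, \varphi_2(p_2)\, \varphi_3(p_3)\, \bar{\varphi}_{1'}(p_{1'})\, \bar{\varphi}_{2'}(p_{2'})\, \bar{\varphi}_{3'}(p_{3'}) \times \exp i\bigl[ p_4 \cdot a_4 + p_5 \cdot a_5 + p_{5'} \cdot a_{5'} \bigr] \tag{2} \]

which expresses the factorization of the processes $v$, $w$, $v'$; the phase factors $\exp i(p_i \cdot a_i)$ in the integrand represent the evolutions of the particles 4, 5, 5' between their birth and their death.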

So how can we get from (1a) to (2)? An easy calculation shows that the phase factor in (1a) is precisely equal to the phase factor in (2) for any internal momenta $p_4$, $p_5$, $p_{5'}$ satisfying momentum conservation at each vertex. But note that

\[ \sum_{i = 4, 5, 5'} p_i \cdot a_i = \tau \sum_{i = 4, 5, 5'} \alpha_i\, p_i \cdot p_i^0 \]

is exactly (up to a constant phase which can be taken out of the integral) $\tau$ times the function

\[ \ell(p) = \sum_i \alpha_i\, (p_i - p_i^0) \cdot p_i^0 \]

considered in Proposition 5. Remember that this function depends actually only on the external momenta, and can be thought of as the linear approximation to the equation of the Landau manifold. Thus,

\[ (1a) = \int [\text{Integrand of (1)}] \times \exp i \tau\, \ell(p) \]

an expression whose asymptotic behavior for $\tau \to \infty$ will clearly depend only on the singular behavior of the integrand of (1) near $\ell = 0$, that is, near the Landau manifold.
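
The general principle invoked here, that the large-$\tau$ behavior of an oscillatory integral is governed by the singularities of the integrand, can be seen on a one-dimensional toy model. The two integrands below are illustrative choices made here (they are not the scattering integrand of the lecture); the sketch assumes NumPy and uses a hand-written trapezoid rule.

```python
import numpy as np

def trapezoid(values, step):
    """Composite trapezoid rule on an equally spaced grid."""
    return step * (np.sum(values) - 0.5 * (values[0] + values[-1]))

def oscillatory_integral(f, tau, lower, upper, step=1e-3):
    """Approximate the integral of f(t) * exp(i * tau * t) over [lower, upper]."""
    t = np.arange(lower, upper + step, step)
    return trapezoid(f(t) * np.exp(1j * tau * t), step)

def smooth(t):
    """Smooth integrand: no singularity anywhere."""
    return np.exp(-t ** 2)

def one_sided(t):
    """Integrated on t >= 0 only, so the full integrand jumps at t = 0."""
    return np.exp(-t)

def compare(tau):
    """Moduli of the two oscillatory integrals at frequency tau."""
    a = abs(oscillatory_integral(smooth, tau, -10.0, 10.0))
    b = abs(oscillatory_integral(one_sided, tau, 0.0, 40.0))
    return a, b
```

The smooth case decays faster than any power of $\tau$, while the integrand with a jump at $t = 0$ gives a modulus close to $1/\sqrt{1 + \tau^2}$, a slow decay fixed entirely by the singular point.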

We shall not push the reasoning further, but stop at this conclusion: 
The macroscopic “multiple scattering behavior”’ of a collision process is deter- 
mined by the singular structure of the scattering operator on the corresponding 
Landau manifold. For further details, the reader is referred to [5], where it
is shown how a singular structure for the scattering operator can be postulated,
which leads precisely to the "multiple scattering" behavior (2).

To conclude this talk, let me briefly sketch the various points of view 
currently adopted by physicists on ‘‘ physical region Landau singularities.” 
To begin with, the word ‘singularity’? can be understood in two different 
ways: the differentiable sense and the analytic sense. On the other hand, one 
can distinguish essentially three kinds of physical approaches to the problem 
of singularities. The first is the study of Feynman integrals, which are gener- 
ally considered to be good models as far as singularities are concerned; all 
the analyticity postulates required by [5] can be shown to hold true for Feyn- 
man integrals in the physical region, so we are happy. The second approach 
is the so-called "S-matrix theory," which tries to build a theory of the
S-operator directly, without using field equations etc. Since the choice of the
"S-matrix axioms" is more or less left to the discretion of the theorist, one
could of course just postulate the analyticity properties formulated in [5].
However, S-matrix theorists do not like that because these properties are 
very redundant, both mathematically and physically: mathematically, because 
much can be deduced on the singular structure of a graph from the singular 
structure of other graphs, by reasoning from the ‘“‘composition of singulari- 
ties’’ (see the end of Part 1); physically, because the unitarity of the S-operator 
imposes extraordinarily strong restrictions on its singular structure, so that 
it seems very difficult to imagine a unitary S-operator with analyticity properties
different from the above mentioned. The third approach is "general
quantum field theory" or "axiomatic field theory," surely the most satisfactory
from a mathematical point of view. Although this theory is very difficult to 
handle, it has already given some encouraging indications concerning the 
asymptotic behavior. 

The following table summarizes the present situation.


                          Feynman integrals   S-matrix theory                     Field theory

Differentiable structure                      IAGOLNITZER, WANDERS                HEPP: study of the
                                              (Suitable hypotheses on the         double scattering
                                              differentiable structure lead       process
                                              to the correct asymptotic
                                              behavior.)

Analytic structure        O.K. (cf. [5])      OLIVE, POLKINGHORNE, STAPP, ...
                                              (analyticity in connection
                                              with unitarity)




REFERENCES 


1. S. Coleman and R. E. Norton, Nuovo Cimento 38, 438, 1965. 

2. R. J. Eden, P. V. Landshoff, D. I. Olive, and J. C. Polkinghorne, The 
Analytic S Matrix (Cambridge Univ. Press, 1966). 

3. K. Hepp, J. Math. Phys. 6, 1762, 1965. 

4. D. Iagolnitzer, J. Math. Phys. 6, 1576, 1965; and Thèse, Paris, 1967.

5. F. Pham, Ann. Inst. Henri Poincaré, VI, No. 2, p. 89, 1967; and Symposia
on Theoretical Physics, Vol. 7 (Plenum Press, New York), to be published.

6. G. Wanders, Helv. Physica Acta 38, 192, 1965. 


XVI 


Algebraic Topology Methods 
in the Theory of Feynman 
Relativistic Amplitudes¹


TULLIO REGGE 


PART I 


In this lecture I would like to introduce you to the spirit and to some 
of the technicalities of the work currently being carried out by Lascoux and 
myself on Feynman relativistic amplitudes (FRA). 

FRA are analytic functions of a very special kind which have been 
defined in connection with relativistic field theory. It is not my job here to
discuss at length their definition, as this has been done already by Lascoux 
and Hepp in previous lectures. The reason why these functions are interest- 
ing to so many people could be summarized as follows: 

1. The scattering amplitude for any physical process can be in principle 
expressed, if field theory is right, as a power expansion in the coupling con- 
stants, each coefficient being a Feynman relativistic amplitude. 

2. In general we cannot compute explicitly these amplitudes since they 
are given by rather complicated integrals. We are therefore happy to gather 
whatever information is available on them, including analytic properties, 
which may yield, hopefully, dispersion relations. 

3. Even supposing that we know each amplitude, the following troubles 
may arise: the power expansion may not converge for some values of the 
parameters involved; the power expansion never converges and it should be 
interpreted as an asymptotic series; some of the coefficients are infinite. 


1 See also: The Analytic S-Matrix, R. J. Eden, P. V. Landshoff, D. I. Olive, and 
J. C. Polkinghorne, Cambridge University Press, 1966. 




The last trouble is taken care of by the renormalization theory into 
which I shall not enter as Hepp has taken care of it in his very clear lecture. 
About the first, and related, second trouble, we have some partial information 
out of specific models. There is a definite tendency to believe that we must
be prepared for the worst and that the Feynman—Dyson expansion never 
converges. However, these problems have never quite been solved and we 
possess surprisingly little information about each coefficient in the expansion. 

The question more often asked is: What shall you do when you know 
in detail the analytic properties of each FRA? The answer is: I do not know. 
In fact, what will be done if we ever arrive at that point largely depends on 
what we shall find. In general we would welcome theorems on classes of 
diagrams which have more chances of being applied to the scattering ampli- 
tude itself. It might well be that after much work we find that we cannot 
make any sense out of all individual contributions and that the whole project 
has to be dropped. But let us move close to the FRA. One such FRA is 
given (Fig. 1) when we give the corresponding Feynman diagram, for example, 


FIGURE 1


together with all the information regarding spins and masses of all the 
relevant particles. Usually we limit ourselves to spinless particles because of 
the simpler combinatorial rules. In principle, however, we think that the 
arbitrary spin case is in no way more complicated and in fact once that com- 
binatorics is worked over, it would be desirable to discuss all spins at the 
same time. 

We shall sketch here briefly the relevant terminology and the Symanzik rules. A Feynman diagram $\mathcal{D}$ is a closed connected one-dimensional simplicial complex; in detail:

(A) A set $\{P_j,\ j = 1 \dots q\}$ of 0-simplexes, called vertices of the diagram.

(B) A set $\mathcal{D} = \{L_i\}$ of 1-simplexes, called lines of the diagram.

(C) A boundary operator $\partial$ which assigns to each oriented line $L_i$ the difference $P - P'$ of two points. We restrict ourselves to diagrams where no line is closed. We construct the incidence matrix $\varepsilon_{ij}$ defined by $\partial L_i = \sum_j \varepsilon_{ij} P_j$.

(D) We call external those points which belong to the boundary of one 
line only. A line having an external point on the boundary is called external. 
It is clear that no line has two external points on the boundary or else the 
diagram would not be connected. We avoid disconnected diagrams because 
they reduce trivially to the discussion of their connected components. 

For the same reason we shall suppose that the diagram does not become 
disconnected upon removal of one line; this class of diagrams can also be 
reduced to simpler diagrams. 

Similarly we shall suppose that every internal point is on the boundary 
of at least three lines. 

From what we say it should be clear that these restrictions are here just 
labor-saving devices and that it is perfectly feasible to define amplitudes even 
if they are not satisfied. 

We introduce the numbers: 

p: number of internal vertices also called the perturbation order of the 

diagram. 

n: number of internal lines of the diagram. 

l: rank of the one-dimensional homology group of the diagram, shortly $H_1(\mathcal{D})$, sometimes vaguely referred to as the number of loops in the diagram.

As $-l + 1$ is the Euler characteristic of $\mathcal{D}$, we have $l = 1 + n - p$.
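
The loop number can also be read off the incidence matrix, since for a connected diagram with p vertices that matrix has rank p - 1. The following is a minimal sketch assuming NumPy; it treats a graph made of internal lines only, and the function names and the (tail, head) encoding of oriented lines are conventions chosen here.

```python
import numpy as np

def incidence_matrix(num_vertices, lines):
    """Row i is the boundary of line i: +1 at its head vertex, -1 at its tail."""
    eps = np.zeros((len(lines), num_vertices), dtype=int)
    for i, (tail, head) in enumerate(lines):
        eps[i, head] += 1
        eps[i, tail] -= 1
    return eps

def loop_number(num_vertices, lines):
    """Rank of the first homology group: number of lines minus rank of the boundary map."""
    eps = incidence_matrix(num_vertices, lines)
    return len(lines) - int(np.linalg.matrix_rank(eps))
```

For a triangle (n = 3, p = 3) this gives l = 1, and for two vertices joined by three lines (n = 3, p = 2) it gives l = 2, both in agreement with l = 1 + n - p.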

Next we consider a space $\mathbb{R}^m$ and the linear functions $k$:

\[ k : C_1(\mathcal{D}) \to \mathbb{R}^m \]

from the group of chains $C_1(\mathcal{D})$ to $\mathbb{R}^m$. Physically we think of the particular value $m = 4$, and $\mathbb{R}^4$ is the so-called momentum space; by introducing a pseudo-Euclidean metric of signature 1, 1, 1, −1 we derive the standard interpretation of the theory according to special relativity. From the mathematical point of view there is nothing to be gained by this special choice at the moment and, therefore, we shall proceed in the general case. This is not to discount the fact that, if we want some physics at the end, we had better worry about the consequences of our choice. We further restrict our functions $k$ by introducing the so-called conservation of momentum defined as:

\[ k\Bigl( \sum_i \varepsilon_{ij} L_i \Bigr) = 0 \tag{1} \]


where $\varepsilon_{ij}$ is the incidence matrix. We introduce further a nondegenerate quadratic form in $\mathbb{R}^m$ which, by a suitable choice of basis in $\mathbb{R}^m$, can be written always as

\[ (k)^2 = \sum_a e_a\, k_a^2, \qquad e_a = \pm 1 \]

It is understood that whenever we specialize to Minkowski space this form reduces to the standard Lorentz metric form. Suppose now that we fix

\[ k(L_i) = k_i, \qquad i \text{ external} \]

It is clear that all functions $k$ satisfying (1) and

\[ k_i = 0 \tag{2} \]

form a linear space $\mathbb{R}^{lm}$. That the dimension of this space is just $lm$ follows from the fact that $k$ is determined by its values on the internal lines:

\[ k(L_i), \qquad i \text{ internal} \]

subjected to the constraints:

\[ k\Bigl( \sum_i \varepsilon_{ij} L_i \Bigr) = \sum_i \varepsilon_{ij}\, k(L_i) = 0 \]

There are $m(p - 1)$ constraints only since we have the identity

\[ \sum_j \sum_i \varepsilon_{ij}\, k(L_i) = 0 \]

which follows from

\[ \sum_j \varepsilon_{ij} = 0 \]

The functions $k$ satisfying (1) and (2) form therefore a linear space $\mathbb{R}^{lm}$ of dimension $m(n - (p - 1)) = lm$. In this space we introduce the usual Euclidean measure $d\mu$. It is also evident that two functions with the same $k_i$ and obeying (1) differ by an element of $\mathbb{R}^{lm}$. Let $p$ be one such function and let $p(L_i) = p_i$, $i$ external. Let $\mathcal{D}$ have $t$ external points and of course also $t$ external lines. The $t$ external values $p_i$ are not all independent because of

\[ \sum_{j\ \mathrm{int.}} \sum_i \varepsilon_{ij}\, p(L_i) = 0 \qquad \text{and} \qquad \sum_j \varepsilon_{ij} = 0 \tag{3} \]

it follows:

\[ \sum_{j\ \mathrm{int.}} \sum_i \varepsilon_{ij}\, p(L_i) = - \sum_{j\ \mathrm{ext.}} \sum_i \varepsilon_{ij}\, p(L_i) = 0 \tag{4} \]

Equation (4) contains external lines only because they are the only ones which connect external points. This is well known as conservation of momentum. We define then

\[ A(p_1 \cdots p_t, m_1 \cdots m_n) = \int d\mu(k) \prod_i \frac{1}{(k(L_i) + p(L_i))^2 - m_i^2 + i\epsilon} \tag{5} \]


As the quadratic form $(k)^2$ is invariant under an $m(m - 1)/2$ parameter orthogonal (or pseudo-orthogonal) group, the FRA will be a function of the invariants which can be constructed out of the $p_i$.

It is possible to exhibit this dependence more explicitly by going over 
an alternative representation of the same FRA according to a method 
originally devised by Feynman himself. I cannot report this method in 
this supposedly short lecture and, anyway, Lascoux has discussed it in some 
detail. 

We introduce $n$ complex variables $z = \{z_1 \dots z_n\}$ in one-to-one correspondence with the lines of $\mathcal{D}$. We use the same notation for a subset of $\mathcal{D}$ and the corresponding subset of variables. Consider all the subsets $\mathcal{C}$ of $\mathcal{D}$ consisting of $l$ internal lines and such that by removing them $\mathcal{D}$ becomes a tree; in the words of Lascoux $\mathcal{C}$ is a cotree. Consider the function:

\[ d(z) = \sum_{\mathcal{C} \subset \mathcal{D}}^{\text{cotrees}}\ \prod_{i \in \mathcal{C}} z_i \]

As $\mathcal{D}$ is of dimension 1, there are no 1-boundaries and all closed chains are cycles. We have the property:


A. Let $\gamma \in H_1(\mathcal{D})$. Let $\gamma$ be of the form

\[ \gamma = \sum_{i \in Y} \epsilon_i L_i, \qquad \epsilon_i = \pm 1 \]

Then $d(z) = 0$ on the set $z_i = 0$, $i \in Y$. In fact every monomial in the defining expansion of $d(z)$ vanishes on the set $z_i = 0$, $i \in Y$. In the sequel we shall name $CP^{Y'}$ the set $z_i = 0$, $i \in Y$, where $Y' = \mathcal{D} - Y$.


PROOF OF A. Suppose $\mathcal{C}$ exists such that $\prod_{i \in \mathcal{C}} z_i$ does not vanish on $CP^{Y'}$. Then the cotree $\mathcal{C}$ has no line in common with $Y$, else this common line would give a vanishing factor in $\prod_{i \in \mathcal{C}} z_i$ and the whole monomial would vanish. Therefore, by removing $\mathcal{C}$ the diagram $\mathcal{D}$, contrary to the definition of cotree, still has at least the cycle $Y$; this being absurd, A must hold.
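
Property A is easy to test on small diagrams by enumerating cotrees directly. The sketch below uses only the standard library and considers a graph made of internal lines only; the function names and the encoding of lines as vertex pairs are assumptions introduced for this illustration.

```python
from itertools import combinations

def is_spanning_tree(num_vertices, lines):
    """True if `lines` (pairs of vertex indices) join all vertices without a cycle."""
    if len(lines) != num_vertices - 1:
        return False
    parent = list(range(num_vertices))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in lines:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return False  # this line would close a cycle
        parent[root_a] = root_b
    return True

def cotrees(num_vertices, lines):
    """All sets of l lines whose removal leaves a spanning tree."""
    n = len(lines)
    l = n - num_vertices + 1
    found = []
    for removed in combinations(range(n), l):
        kept = [lines[i] for i in range(n) if i not in removed]
        if is_spanning_tree(num_vertices, kept):
            found.append(removed)
    return found

def d(z, num_vertices, lines):
    """Sum over cotrees of the product of the corresponding variables."""
    total = 0
    for cotree in cotrees(num_vertices, lines):
        term = 1
        for i in cotree:
            term *= z[i]
        total += term
    return total
```

For two vertices joined by three lines, d(z) is the sum of the three pairwise products, and it vanishes as soon as the two variables of any cycle made of two lines are set to zero, as property A requires.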
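Similarly, we define the function $D(z, s)$ as follows. We call $\mathcal{C}$ a cut if $\mathcal{C}$ is a subset of $\mathcal{D}$ made of $l + 1$ internal lines and if by removing it $\mathcal{D}$ becomes disconnected into two trees, $\mathcal{D} - \mathcal{C} = \mathcal{D}_1 \cup \mathcal{D}_2$. With each cut we associate the invariant

\[ s(\mathcal{C}) = \Bigl( \sum_{j \in \mathcal{D}_1} p(L_j) \Bigr)^2 = \Bigl( \sum_{j \in \mathcal{D}_2} p(L_j) \Bigr)^2 \]

(the sums running over the external lines of each part). By conservation of momentum $s(\mathcal{C})$ depends on the external $p_i$ only. Similarly, we introduce the monomials $z(\mathcal{C}) = \prod_{i \in \mathcal{C}} z_i$. We define then

\[ D(z, s) = \sum_{\mathcal{C} \subset \mathcal{D}} s(\mathcal{C})\, z(\mathcal{C}) - d(z) \sum_i m_i^2 z_i \]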

A few remarks before proceeding further: Different cuts may yield the same invariant $s(\mathcal{C})$. Secondly, all the $z(\mathcal{C})$ and hence $D(z, s)$ vanish on $CP^{Y'}$ like $d(z)$. Finally, and this is trivial but worth pointing out, we can choose the invariants and masses in such a way as to make $D(z, s)$ linear in all these variables. We suppose that this part of the job has been done already and that we have a complete list of all the relevant variables and related Symanzik polynomials:

\[ D(z, s) = \sum_i s_i\, D_i(z) \]


Because of the relative simplicity of this linear dependence we prefer to work 
out the analytic properties of the FRA starting directly from the alternative 
definition mentioned before, which can be written entirely in terms of d(z) 
and D(z, s) as follows: 


\[ A(s) = C \int \frac{dz_1 \cdots dz_n\; d(z)^{\,n - m(l+1)/2}}{D(z, s)^{\,n - ml/2}}\; \delta\Bigl( \sum_i z_i - 1 \Bigr) \tag{6} \]


Here we use directly the invariants s; only. This choice is strictly a matter 
of personal taste and I do not intend to deny here the validity of other attempts 
based on Eq. (5). 

We shall suppose that all the integrals in (6) converge since a discussion 
of divergent integrals according to renormalization theory is well outside our 
goal in this lecture. 

In (6) we may consider the $z_i$ as complex variables. We consider also $CP^{n-1}$, the complex projective space of homogeneous coordinates $z_i$. In this space we introduce the linear system of algebraic varieties $D(z, s) = 0$. A generic element of the pencil will be written as $W_s$, or simply $W$ if no confusion arises. The linear variety defined by $z_i = 0$, where $i \in \mathcal{A} \subset \mathcal{D}$, will be written as $CP^{\mathcal{A}'}$ (with $\mathcal{A}' = \mathcal{D} - \mathcal{A}$). A natural way of bringing in the complex projective space structure is provided by p. 298 of Gel'fand, Vol. I. This we shall see in detail in Part II.

In what follows we shall suppose $m$ even so that $n - ml/2$, $n - m(l + 1)/2$ are always integers.


PART II


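Consider the Grassmann algebra of exterior forms in the differentials $dz_i$. We introduce some abbreviations. Let $\Omega$ denote as before the set of all lines in $\mathcal{D}$. Let $\mathcal{A}$, $\mathcal{B}$ be generic subsets of $\mathcal{D}$. We write then

\[ \omega_{\mathcal{A}} = dz_{i_1} \wedge dz_{i_2} \wedge \cdots \wedge dz_{i_q}, \qquad \mathcal{A} = \{ i_1 \cdots i_q \}, \qquad |\mathcal{A}| = q \]

where $\wedge$ is the usual anticommuting product of the Grassmann algebra, and the $i_k$ run from $i_1$ to $i_q$ in increasing order. Let

\[ \mathcal{A}_k = \{ i \in \mathcal{A},\ i < k \}, \qquad \varepsilon_{k, \mathcal{A}} = \begin{cases} 0 & k \in \mathcal{A} \\ (-1)^{|\mathcal{A}_k|} & k \notin \mathcal{A} \end{cases} \]

We have then

\[ dz_k \wedge \omega_{\mathcal{A}} = \omega_{\mathcal{A} \cup k}\; \varepsilon_{k, \mathcal{A}} \]

We also introduce the form

\[ \eta_{\mathcal{A}} = \sum_{i \in \mathcal{A}} z_i\; \varepsilon_{i, \mathcal{A} - i}\; \omega_{\mathcal{A} - i} \]

Let $d$ be the boundary operator. By explicit computation we find

\[ d\eta_{\mathcal{A}} = |\mathcal{A}|\; \omega_{\mathcal{A}} \]

We have also:

\[ dz_i \wedge \eta_{\mathcal{A}} = z_i\, \omega_{\mathcal{A}} - \varepsilon_{i, \mathcal{A}}\; \eta_{\mathcal{A} \cup i} \]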
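
A two-variable example may help to fix the signs. It is a sketch under the assumption that the sign factor attached to an index $k$ and a subset not containing $k$ is $(-1)$ raised to the number of elements of the subset smaller than $k$; the labels $\eta_{\{1,2\}}$ and $\omega_{\{1,2\}}$ follow that reading and are not guaranteed to match the original typography.

```latex
% Subset {1,2}: the sign for (1,{2}) is +1, the sign for (2,{1}) is -1.
\eta_{\{1,2\}} = z_1\, dz_2 - z_2\, dz_1 , \qquad
d\eta_{\{1,2\}} = dz_1 \wedge dz_2 - dz_2 \wedge dz_1
               = 2\, dz_1 \wedge dz_2 = 2\, \omega_{\{1,2\}} .
```

The factor 2 is the number of elements of the subset, as in the general formula for $d\eta$.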
In the sequel we shall restrict ourselves to forms with coefficients which are rational and homogeneous functions of the $z$, singular on some fixed algebraic variety $W \subset CP^{n-1}$. If the coefficients are homogeneous of degree $-q$ and the form is given by

\[ \alpha^q = \sum_{|\mathcal{A}| = q} R_{\mathcal{A}}\, \omega_{\mathcal{A}} \]

we say that the form $\alpha^q$ is affine. It is easily checked that the boundary of an affine form is itself affine. If furthermore $\alpha^q$ can be expressed in terms of the $\eta$ as follows:

\[ \alpha^q = \sum_{|\mathcal{A}| = q + 1} T_{\mathcal{A}}\, \eta_{\mathcal{A}} \]

and $\alpha^q$ is affine we say that $\alpha^q$ is a projective form. It is also true but much less obvious that the boundary of a projective form is also projective. This is best seen by introducing the operator $z$:

\[ z : R_{\mathcal{A}}\, \omega_{\mathcal{A}} \to R_{\mathcal{A}}\, \eta_{\mathcal{A}} \]

The operator $z$ is of degree $-1$. After some complicated but otherwise trivial manipulations one finds the identities:

\[ z^2 = 0, \qquad zd + dz = 0 \]

It should be pointed out that $z$ maps affine forms into projective (and a fortiori affine) forms and that its kernel is the set of all projective forms. The temptation to construct a cohomology theory out of $z$ can be resisted by noting that the ensuing theory is trivial: all $z$-cycles are $z$-boundaries. Any projective form $\alpha^q$ can be written therefore as $\alpha^q = z\beta^{q+1}$, where $\beta^{q+1}$ is affine. It follows that

\[ d(z\beta^{q+1}) = -z\, d\beta^{q+1} \]

so that $d\alpha^q$ is itself projective, as expected. If $\alpha^q$ is closed, by definition we have:

\[ d\alpha^q = -z\, d\beta^{q+1} = 0 \]


Grothendieck has shown that the cohomology theory constructed out of these forms yields cohomology groups isomorphic to the usual de Rham cohomology theory in $CP^{n-1} - W^{n-2}$. Using the quoted Grothendieck result we shall simply write

\[ C^q(CP^{n-1} - W^{n-2}), \quad Z^q(CP^{n-1} - W^{n-2}), \quad B^q(CP^{n-1} - W^{n-2}), \quad H^q(CP^{n-1} - W^{n-2}) \]

respectively for the groups of projective q-forms, q-closed forms, q-boundaries, and q-cohomology classes.


In what follows the forms of the kind

\[ \omega^{n-1} = \eta\, R(z), \]

where shortly $\eta = \eta_\Omega$, will be the most interesting for us. They are closed since there is no form of higher dimension. They can obviously be written as:

\[ \omega^{n-1} = \frac{\eta\, Q(z)}{D^P(z)} \]

where $D(z, s)$ is the Symanzik polynomial introduced in Part I, if we select $W$ to be the variety $D(z, s) = 0$. Consider now the integral:

\[ \int_O \omega^{n-1} \]

where $O$ is an open set on a sufficiently smooth hypersurface $\Sigma$ in $\mathbb{C}^n - 0$. The important fact is that the value of the integral depends only on the image of $O$ in the map $\mathbb{C}^n - 0 \to (\mathbb{C}^n - 0)/(\mathbb{C} - 0)$ which defines $CP^{n-1}$. In order to show it let $O'$ be another such open set on another hypersurface $\Sigma'$ having the same image in $CP^{n-1}$. We form a closed chain of integration by the combination $T = O - O' + A$, where $A$ is the set obtained by joining points on $\partial O$, $\partial O'$ having the same image in $CP^{n-1}$ with a segment. The integral of the form on $T$ is obviously zero since the form is closed; on the other hand, the integral of the form along $A$ is also zero because such an integral vanishes on any subset of a cone in $\mathbb{C}^n - 0$, as one can check easily. Therefore

\[ \int_O \omega^{n-1} = \int_{O'} \omega^{n-1} \]

This final result really means that the integral of a projective form can 
be considered as an ordinary integral over the complex projective space. In 


P integer, Q of degree (1+ 1)P—n 


Algebraic Topology Methods and FRA 44] 


the applications which we have in mind we shall integrate over the simplex 
S?:z;>0,i¢€Q. Which hypersurface we select in C" is then not critical. 
In order to carry out the integration, we choose then the plane Z: " z, = 1. 
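It follows that on Z:

$$dz_n = -\sum_{i=1}^{n-1} dz_i$$

and

$$\eta = \eta_{\mathcal{Q}} = \bigwedge_{i=1}^{n-1} dz_i$$

Finally we have the identity

$$\int_{S^{\mathcal{Q}}} \frac{\eta\, Q(z)}{D^{P}(z, s)} = \int_{z_i \ge 0,\ \sum_i z_i = 1} \frac{dz_1 \cdots dz_{n-1}\, Q(z)}{D^{P}(z, s)} = \int_{z_i \ge 0} \frac{dz_1 \cdots dz_n\, \delta\bigl(\sum_i z_i - 1\bigr)\, Q(z)}{D^{P}(z, s)}$$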


and by selecting $Q(z) = d(z)^{n-(l+1)m/2}$ we may express the integral in (6) in
the form:

$$A(s) = C \int_{S^{\mathcal{Q}}} \frac{\eta\, d(z)^{\,n-(l+1)m/2}}{D(z, s)^{\,n-lm/2}} \qquad (7)$$


This we like because it brings directly into evidence the underlying complex
projective structure in the integral. By looking at it we see two possible
alternatives:

(A) The variety W is a general variety of degree l + 1 and is in general
position with respect to the simplex S.

(B) The variety W has fixed singularities and/or contains some of the
subsimplexes of S. This will be the general situation as we shall see.

In order to solve case (B) we need to discuss case (A) in advance.
Diagrams with l = 1 belong in general to case (A) and diagrams with l > 1 do
not. However, l = 1 also implies that the degree of D is 2 and this is too
much of a simplification.

The best approach is to study the large class of fictitious FRA, that is,
integrals of the form (6) which do not arise from (4), such that W is of general
degree and in general position. These FRA can be investigated in considerable
detail. Once we know something about them we shall impose the specializations
required by the physical applications.

Before we go into some detailed discussion of these integrals, let us see 
which results of significance to us are already known in the previous literature. 

We may summarize them as follows: 

(I) A(s) is a multivalued analytic function of s. 




(II) The singularities of A(s) lie on an algebraic variety $\mathcal{L}$ (Landau
variety) in the variables s. We suppose here that all the s are finite and not
all of them zero. It is a trivial result that A is homogeneous in the s of degree
ml/2 - n. It is therefore sometimes convenient to regard the s as homogeneous
coordinates of a complex projective space CP and the Landau
varieties as varieties L in this space or, equivalently, to set one of the s equal to 1.

(III) The FRA (fictitious and not) are generalized hypergeometric functions
in the following sense. We select a point P (base point) not on L and a
neighborhood O of P not intersecting L. Construct the fundamental group
of the complement of L, that is, $\pi_1(CP - L)$ with base P. Define the
value of A in O by means of the integral (7). To each element g of the
fundamental group we may then associate a definite analytic continuation of
A along a representative loop of g and this continuation will depend on the
homotopy class g of the loop only. This continuation will be a new function
gA also regular and defined in O. Then the property in question states that
there exists a fixed finite set of elements of $\pi_1(CP - L)$: $g_1, \ldots, g_K$ such that
any gA is a linear combination with constant coefficients of the $g_1A, \ldots, g_KA$.

Let $A_1, \ldots, A_K$ be any set linearly equivalent to the $g_1A, \ldots, g_KA$. We
have then

$$gA_i = \sum_{j=1}^{K} \Lambda_{ij}(g)\, A_j$$

where the $\Lambda_{ij}$ are constants. It is evident then that the map $g \to \Lambda_{ij}(g)$ yields
a representation of the fundamental group by means of K-dimensional
matrices.

We may define our goal as the explicit construction of $\pi_1(CP - L)$ and of
the corresponding representation. By explicit construction we mean a procedure
by which a presentation of $\pi_1(CP - L)$ is given and such that to
every loop with base point P an element of the fundamental group is explicitly
assigned. In the known cases the fundamental group is extremely complicated
and one succeeds in deriving partial results only. Very helpful in this
direction is the remark that property III is equivalent to a set of differential
equations which reduces to the ordinary differential equation for the hypergeometric
functions case.

This equivalence has been pointed out to us by Parassiuk in Kiev and 
we think that it was known as far back as Schlaefli. The reasoning goes as 
follows. We keep all variables fixed but one, say s,. We form the 
K × (K + 1) matrix:

$$\begin{pmatrix} A_1(s) & \cdots & A_K(s) \\ \dfrac{dA_1}{ds_1}(s) & \cdots & \dfrac{dA_K}{ds_1}(s) \\ \vdots & & \vdots \\ A_1^{(K)}(s) & \cdots & A_K^{(K)}(s) \end{pmatrix}$$

Let $\Delta^p(s)$ be the determinant which one obtains by deleting the row of the
pth derivatives in the matrix. Since the effect of g on the matrix is to replace
each column with a linear combination of the columns themselves according to

$$g A_i^{(p)}(s) = \sum_j \Lambda_{ij}(g)\, A_j^{(p)}(s)$$

the effect on it is simply:

$$g\Delta^p(s) = (\det \Lambda_{ij})\, \Delta^p(s)$$

therefore $\Delta^p(s)/\Delta^0(s)$ remains unaffected and is a rational function of s.
Consider now the determinant:

$$\begin{vmatrix} A_1(s) & \cdots & A_K(s) & A(s) \\ \vdots & & \vdots & \vdots \\ A_1^{(K)}(s) & \cdots & A_K^{(K)}(s) & A^{(K)}(s) \end{vmatrix}$$

In this determinant the last column is a linear combination of the others,
therefore it vanishes. By expanding it along the last column we find:

$$\sum_{p=0}^{K} \frac{\Delta^p(s)}{\Delta^0(s)}\, \frac{d^p A}{ds_1^p}\, (-)^p = 0$$

This is the required differential equation of order K. It is a very easy matter 
to verify that the existence of the differential equation is equivalent to Property 
III. There are of course as many differential equations as variables. The 
knowledge of all these differential equations would imply practically the final 
goal. These differential equations are all ordinary equations in the sense that
in each of them we may regard all variables but one as fixed parameters. All 
together their general integral does not depend on arbitrary functions as in 
the theory of partial differential equations but just on K constants as expected. 
We shall devote the rest of this lecture to showing how one can derive these 
equations in principle from (7). These equations are in fact just a slight 
generalization of the well-known Picard—Fuchs equations. 
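As a minimal illustration of the determinant argument (it is not taken from the lecture), one may look at the simplest case K = 1 with a single variable s. The function $A_1(s) = s^{a}$, with a a non-integer constant, is a hypothetical example chosen only because its continuation around s = 0 is a pure phase; $\Delta^0$ and $\Delta^1$ denote the determinants obtained by deleting the row of undifferentiated functions and the row of first derivatives.

```latex
% Illustrative K = 1 case; A_1(s) = s^{a} is an assumed example.
A_1(s) = s^{a}, \qquad gA_1 = e^{2\pi i a} A_1 \quad (\text{loop around } s = 0).
% The matrix has the two rows A_1 and A_1'. Deleting one row at a time:
\Delta^{0} = A_1' = a\, s^{a-1}, \qquad \Delta^{1} = A_1 = s^{a}.
% Both pick up the same factor under g, so the ratio is single valued and rational:
\frac{\Delta^{1}}{\Delta^{0}} = \frac{s}{a}.
% A 2 x 2 determinant whose last column (A, A') is proportional to the first vanishes:
\begin{vmatrix} A_1 & A \\ A_1' & A' \end{vmatrix} = 0
\quad\Longrightarrow\quad \Delta^{0} A - \Delta^{1} A' = 0
\quad\Longrightarrow\quad A' - \frac{a}{s}\, A = 0 .
```

The resulting first-order equation has the single solution $s^{a}$ up to a constant, in line with the statement that the general integral depends on K constants.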


PART III


In this section I shall sketch the derivation of the differential equations 
from (7) in the case (A). For this purpose we consider all forms of the kind

$$\omega = \frac{\eta\, H(z)}{D^{P}} \qquad (8)$$

Since P can be chosen arbitrarily high, the set of these forms is an infinite-dimensional
linear space. The factor space of these forms modulo the boundaries
is, however, finite-dimensional. This means that for sufficiently high P
all forms are equivalent modulo a boundary to forms of smaller P. A
reduction technique is therefore necessary if we want to evaluate the cohomology
group with purely algebraic means.

The criterion is offered by the following identity: 


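$$\frac{\eta}{D^{P+1}} \sum_{i} H_i \frac{\partial D}{\partial z_i} = \frac{1}{P}\left[ \frac{\eta}{D^{P}} \sum_{i} \frac{\partial H_i}{\partial z_i} + d\left( \sum_{i} (-)^{i}\, \frac{H_i\, \eta_{\mathcal{Q}-i}}{D^{P}} \right) \right] \qquad (9)$$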


Therefore, if in (8),

$$H(z) = \sum_{i} H_i \frac{\partial D}{\partial z_i} \qquad (10)$$


then the form can be reduced to one of lesser degree in the denominator 
modulo a boundary. 

The difficult part is to prove that any form equivalent to a less singular 
one necessarily reduces through the proposed mechanism. This I do not 
intend to discuss here. In collaboration with Lascoux, I also conjectured a 
similar reduction criterion to be used when D(z) is reducible. Suppose that
$D = D_1 D_2$. Then the form

$$\frac{\eta\, H(z)}{D_1^{P_1} D_2^{P_2}}$$

is equivalent to another less singular form if
O(D D2) 
H(z) =D, EO ai D,Y K (2)! ‘ds y. Giz) Gs = (11) 


Again here as before the difficult point is not to check that the criterion is
sufficient, but that it is necessary. This we have to assume or else there would
be equivalent forms whose equivalence could not be proved by use of
these criteria, and this would be an extremely disagreeable feature for computational
purposes. We suppose now that the linear system D(z, s) is given
and that a basis for the cohomology groups has been selected.
Suppose now $s \notin L$.
Let $\omega_\alpha^{n-1}$ be a set of forms yielding such a basis. We may write, therefore,


For certain diagrams it can be verified that $H^{n-1}(CP^{n-1} - W)$ reduces to
the identity. In this case every closed form is a boundary and we may transform
the integral (7) into a surface integral. The study of (A) is then reduced
to the discussion of similar but simpler integrals coming out of the surface
terms. In what follows we shall suppose that there is a nontrivial set of forms
according to (12). We may always select the first form to be the one given by
the original FRA. The others will be chosen by some criterion of simplicity
in the actual computation. It is in fact irrelevant in principle which forms we
select. However, the actual writing of the differential equations may prove
to be irksome following an unhappy choice. It would be interesting to see if
it is possible to choose the other forms as those which correspond to processes
with spin. In this case our discussion would unify the theory by including
different values of spin at the same time. We suppose that the same classification
of forms has been achieved for all dimensions. We define then


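$$A_\alpha^{\mathcal{Q}}(s) = \int_{S^{\mathcal{Q}}} \omega_\alpha^{\mathcal{Q}, n-1} \qquad \alpha = 1 \cdots B^{\mathcal{Q}, n-1}$$

Of these the first integral is just the original FRA. We compute now:

$$\frac{\partial A_\alpha^{\mathcal{Q}}(s)}{\partial s_k} = \int_{S^{\mathcal{Q}}} \frac{\partial \omega_\alpha^{\mathcal{Q}, n-1}}{\partial s_k}, \qquad \frac{\partial \omega_\alpha^{\mathcal{Q}, n-1}}{\partial s_k} = -P\, \frac{\eta\, Q_\alpha(z)\,(\partial D/\partial s_k)}{D^{P+1}(z, s)}$$

By hypothesis D(z, s) is linear in s and therefore $\partial D/\partial s_k$ is independent of s.
It follows that $\partial\omega/\partial s_k$ is again a form in $C^{n-1}(CP^{\mathcal{Q}} - W)$ and that there exist
constants (that is, independent of the z but still functions of s) $M^{\mathcal{Q}}_{\alpha\beta k}(s)$ and
a form $\tau^{n-2}$ such that

$$\frac{\partial \omega_\alpha^{\mathcal{Q}, n-1}}{\partial s_k} = \sum_\beta M^{\mathcal{Q}}_{\alpha\beta k}(s)\, \omega_\beta^{\mathcal{Q}, n-1} + d\tau^{n-2}$$

Clearly then

$$\frac{\partial A_\alpha^{\mathcal{Q}}}{\partial s_k} - \sum_\beta M^{\mathcal{Q}}_{\alpha\beta k}(s)\, A_\beta^{\mathcal{Q}} = \int_{\partial S^{\mathcal{Q}}} \tau^{n-2}$$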


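The surface integral can be written as

$$\int_{\partial S^{\mathcal{Q}}} \tau^{n-2} = \sum_i \int_{S^{\mathcal{Q}-i}} \tau^{\mathcal{Q}-i,\, n-2}$$

$\tau^{\mathcal{Q}-i,\, n-2}$ is here the restriction of $\tau^{n-2}$ to $CP^{\mathcal{Q}-i}$.

This surface integral can be expressed as a sum of terms, one for each
subsimplex on the boundary of $S^{\mathcal{Q}}$. Let $S^{\mathcal{P}} = CP^{\mathcal{P}} \cap S^{\mathcal{Q}}$, $W^{\mathcal{P}} = CP^{\mathcal{P}} \cap W$.
We also label forms and their integrals on $CP^{\mathcal{P}}$ with the label $\mathcal{P}$. We
introduce now a corresponding basis of forms on $CP^{\mathcal{P}}$: $\omega_\alpha^{\mathcal{P}}$, $\alpha = 1 \cdots B^{\mathcal{P}, |\mathcal{P}|-1}$.
Let the corresponding integral be written as

$$A_\alpha^{\mathcal{P}}(s) = \int_{S^{\mathcal{P}}} \omega_\alpha^{\mathcal{P}} \qquad \alpha = 1 \cdots B^{\mathcal{P}, |\mathcal{P}|-1}$$

By assumption we can always write

$$\tau^{\mathcal{Q}-i,\, n-2} = \sum_{\gamma=1}^{B^{\mathcal{Q}-i,\, n-2}} M_\gamma^{\mathcal{Q}-i}(s)\, \omega_\gamma^{\mathcal{Q}-i,\, n-2} + d\sigma^{n-3}$$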




The surface integral in turn can be expressed in terms of the corresponding
$\omega_\gamma^{\mathcal{P}}$, $\mathcal{P} \subset \mathcal{Q}$, and of other surface integrals of lesser dimension. The process
is then repeated until we reach the lowest dimension where it trivially terminates.
In this way we express $\partial A_\alpha^{\mathcal{Q}}(s)/\partial s_k$ in terms of all amplitudes $A_\gamma^{\mathcal{P}}(s)$,
where $\mathcal{P} \subseteq \mathcal{Q}$:

$$\frac{\partial A_\alpha^{\mathcal{Q}}}{\partial s_k} = \sum_{\mathcal{P} \subseteq \mathcal{Q}}\ \sum_{\gamma=1}^{B^{\mathcal{P}, |\mathcal{P}|-1}} M^{\mathcal{Q}\mathcal{P}}_{\alpha\gamma k}(s)\, A_\gamma^{\mathcal{P}}(s)$$

By using the same procedure with $\mathcal{P}$ in place of $\mathcal{Q}$ we can also compute all
derivatives $\partial A^{\mathcal{P}}/\partial s_k$ in terms of the amplitudes $A^{\mathcal{R}}$, $\mathcal{R} \subseteq \mathcal{P}$. We may gather
now all the components $A^{\mathcal{P}}$ in a single one-column matrix N and write all
these equations in the form:

$$\frac{\partial N}{\partial s_k} = \mathcal{M}(s)\, N \qquad (13)$$

where $\mathcal{M}$ stands for a square matrix whose elements are functions of s. It
turns out that these functions are rational functions of s and that their
singularities are just the Landau varieties as expected. If we now eliminate
all the amplitudes in (13) but $A_1^{\mathcal{Q}}$ we arrive at an ordinary differential equation
of order equal to the rank K of N, which is the desired equation. In practice
we do not want to do so because the system (13) is much handier to deal with
and provides a lot of relationships among different amplitudes.

The order of the differential equation is, according to our investigation,
just the number K of independent analytic continuations of A. This number
may be very high and we may naively make it even higher if the number of
components in the matrix N is not counted right. The reason for this is that
we have no assurance, and in general it is false, that the components of N
are all independent. In order to count these independent components
properly we recall the notation $B^{\mathcal{P}, p}$ for the Betti numbers, that is, the
rank of $H^p(CP^{\mathcal{P}} - W^{\mathcal{P}})$. It follows that in $\omega_\alpha^{\mathcal{P}, p}$ the index $\alpha$ takes the values
$1 \cdots B^{\mathcal{P}, p}$. N has therefore in the naive computation:

$$\sum_{\mathcal{P} \subseteq \mathcal{Q}} B^{\mathcal{P}, |\mathcal{P}|-1}$$

components. You may have noted that I also introduced in the lecture the
forms $\omega^{\mathcal{P}, p}$ belonging to the cohomology groups of smaller dimension
$p < |\mathcal{P}| - 1$ and that so far in deriving our differential equation all these
groups have remained unused because for every subsimplex $\mathcal{P}$ we used only
the forms with the same and maximum dimension $|\mathcal{P}| - 1$.


Suppose now that we have a closed form $\omega^{\mathcal{Q}, n-2}$ corresponding to a
nontrivial element of $H^{n-2}(CP^{\mathcal{Q}} - W)$. Since $d\omega^{\mathcal{Q}, n-2} = 0$ we have

$$\int_{\partial S^{\mathcal{Q}}} \omega^{\mathcal{Q}, n-2} = 0$$

By expressing the restrictions of $\omega^{\mathcal{Q}, n-2}$ on the boundary simplexes in terms
of the basis set of forms $\omega_\gamma^{\mathcal{Q}-i,\, n-2}$ we obtain an identity among the relevant
amplitudes $A_\gamma^{\mathcal{Q}-i}$. This identity² could be trivially void of content if the
restriction of $\omega^{\mathcal{Q}, n-2}$ to $\partial S^{\mathcal{Q}}$ were an exact form, in which case we would
get 0 = 0.

This remark shows first of all that the final identity will depend on the
cohomology class of the initial form only, and that therefore there are at most
$B^{\mathcal{Q}, n-2}$ identities. But it is also true that all these identities are actually
independent and not void by virtue of the Lefschetz theorem. What we use
here is actually an affine version of the Lefschetz result that

$$H^p(CP^{n-1} - U) = 0 \qquad p > n-1 \qquad (14)$$

whenever U is a complex variety of complex codimension 1. We may state
the affine version as³

$$H^p(CP^{n-1} - U,\ V - V \cap U) = 0 \qquad p \ne n-1 \qquad (15)$$

provided the hypersurfaces U and V intersect transversally. It is not required
that U and V be regular; indeed they could be reducible. In our case we take
$U = W$ and $V = \bigcup_i CP^{\mathcal{Q}-i}$.

Before proceeding further, some justification of the affine theorem is 
not out of place. This is not directed toward the mathematicians in the 
audience—they have heard it already too often during the last weeks. 
Chances are, however, that this lecture will be read by people who were not 
present at the Rencontres. 

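We begin with the usual exact sequence:

$$H^{p-1}(V - V \cap U) \to H^p(CP^{n-1} - U,\ V - V \cap U) \to H^p(CP^{n-1} - U) \xrightarrow{\ \rho\ } H^p(V - V \cap U) \to \qquad (16)$$

The standard Lefschetz theorem states that:

$$H^{p-1}(V - V \cap U) = 0, \qquad H^p(CP^{n-1} - U) = 0 \qquad \text{for } p > n-1.$$

Inserting these data into the sequence (16) we find

$$H^p(CP^{n-1} - U,\ V - V \cap U) = 0 \qquad p > n-1 \qquad (17)$$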


We have reached the first half of (15) in a rather trivial fashion. We 
use now the fundamental duality theorem of Lefschetz (for cohomology) (see 
p. 142 of Topology by S. Lefschetz, Am. Math. Soc. Coll. Publ., Vol. XII, 
1930). 


2 This identity was first obtained for the triangle graph by Dr. B. Jakšić.

3 R. Bott has just demolished by mail my unshakeable faith in the validity of (15) under 
the wildest hypothesis. Luckily enough the theorem is still true when we need it, that is, 
when U and V intersect nicely. 


Fundamental Duality Theorem. Let K, L, L' be closed complexes and
$K \supset L \supset L'$, let K - L be a manifold. If L'' = L - L' the duality theorem for
absolute manifolds holds between the cohomology characters of (K - L')/L''
and that of (K - L'')/L'.

We take $K = CP^{n-1}$, $L = V \cup U$, $L' = U$. It follows that

$$\operatorname{rank} H^p(CP^{n-1} - U,\ V - U \cap V) = \operatorname{rank} H^{2n-p-2}((CP^{n-1} - V) \cup U,\ U)$$

If U and V intersect transversally, we can thicken $U \cap V$ and use the excision
theorem, and we arrive finally at

$$\operatorname{rank} H^p(CP^{n-1} - U,\ V - U \cap V) = \operatorname{rank} H^{2n-p-2}(CP^{n-1} - V,\ U - U \cap V) \qquad (18)$$

This yields the second half of (15).
Thom and Steenrod have proposed other proofs of the same result and
we plan to report them elsewhere. All these proofs assume that U and V
intersect transversally and in fact Bott has produced a counterexample from a
paper by Howards which also reports the affine Lefschetz theorem. In order
to use (15) more efficiently we look again at (16) for p < n - 1, so that

$$0 \to H^p(CP^{n-1} - U) \xrightarrow{\ \rho\ } H^p(V - V \cap U) \to H^{p+1}(CP^{n-1} - U,\ V - V \cap U)$$

But the map $\rho$ is just the restriction map, and by the exactness property $\rho$ is
an injection for p = n - 2 and an isomorphism for p < n - 2. It follows that
only exact forms restrict into exact forms and in this way we always get $B^{\mathcal{Q}, n-2}$
meaningful identities.

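The same counting procedure could be repeated for each subsimplex
$S^{\mathcal{P}} \subset S^{\mathcal{Q}}$. Each element of the cohomology group $H^{|\mathcal{P}|-2}(CP^{\mathcal{P}} - W^{\mathcal{P}})$
brings in an identity on the amplitudes $A^{\mathcal{P}-i}$. We expect therefore that the
correct number of independent amplitudes will be:

$$\sum_{\mathcal{P} \subseteq \mathcal{Q}} \left( B^{\mathcal{P}, |\mathcal{P}|-1} - B^{\mathcal{P}, |\mathcal{P}|-2} \right)$$

This computation is again naive, for the existence of a nontrivial element of
$H^{\mathcal{Q}, n-3}$ shows that not all the identities generated by elements of $H^{\mathcal{Q}, n-2}$ are
actually independent. A similar interpretation can be given for all elements of
the lower dimensional cohomology groups $H^{\mathcal{P}, p}$ and the net result is that the
effective dimension of N is given by the alternating sum:

$$K = \sum_{\mathcal{P} \subseteq \mathcal{Q}} \left( \sum_{p=0}^{|\mathcal{P}|-1} (-)^{|\mathcal{P}|-1-p}\, B^{\mathcal{P}, p} \right)$$

By introducing the Euler characteristics:

$$\chi(CP^{\mathcal{P}} - W^{\mathcal{P}}) = \sum_{p=0}^{|\mathcal{P}|-1} (-)^p\, B^{\mathcal{P}, p}$$

we see that K can be written as

$$K = \sum_{\mathcal{P} \subseteq \mathcal{Q}} (-)^{|\mathcal{P}|-1}\, \chi(CP^{\mathcal{P}} - W^{\mathcal{P}})$$

By using now the Mayer-Vietoris sequence and the usual relative sequence one
arrives at the compact form:

$$K = (-)^{n-1}\, \chi\Bigl(CP^{\mathcal{Q}} - W,\ \bigcup_i \bigl(CP^{\mathcal{Q}-i} - W^{\mathcal{Q}-i}\bigr)\Bigr)$$

This number is also the rank of

$$H_{n-1}\Bigl(CP^{\mathcal{Q}} - W,\ \bigcup_i \bigl(CP^{\mathcal{Q}-i} - W^{\mathcal{Q}-i}\bigr)\Bigr)$$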

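Thus K is also the number of independent cycles of integration relative to the
hyperplanes $CP^{\mathcal{Q}-i}$. In fact had we started from the functions:

$$A_\alpha^{\mathcal{P}}(s, v) = \int_v \omega_\alpha^{\mathcal{P}} \qquad v \in H_{n-1}\Bigl(CP^{\mathcal{Q}} - W,\ \bigcup_i \bigl(CP^{\mathcal{Q}-i} - W^{\mathcal{Q}-i}\bigr)\Bigr)$$

we would have found the same set of differential equations for $A_\alpha^{\mathcal{P}}(s, v)$ and
the corresponding matrix N(s, v). It follows that the general integral of (13)
is given by:

$$\sum_{a=1}^{K} \lambda_a\, N(s, v_a)$$

where the summation index a runs over all the elements of a basis

$$v_a \in H_{n-1}\Bigl(CP^{\mathcal{Q}} - W,\ \bigcup_i \bigl(CP^{\mathcal{Q}-i} - W^{\mathcal{Q}-i}\bigr)\Bigr)$$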


In this way we have related the numerology of our differential equations to the 
corresponding numerology of homology theory. Of course my arguments 
here are at the best heuristic, but we hope to be able to formalize them suffi- 
ciently in detail as to make them rigorous. 

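The formula given so far for K can be used immediately in connection
with the published Euler characteristics for algebraic varieties in general
position. We just borrow them from Hirzebruch. We have first that:

$$\chi(CP^n) = n + 1 \quad \text{or} \quad \chi(CP^{\mathcal{P}}) = |\mathcal{P}|$$

For a general hypersurface $W^{\mathcal{P}}$ of degree l + 1 in $CP^{\mathcal{P}}$ we have:

$$\chi(W^{\mathcal{P}}) = \frac{1}{l+1}\left[\, |\mathcal{P}|\,(l+1) - 1 + (-l)^{|\mathcal{P}|} \right] \qquad (19)$$

By replacing these values in the expression for K above we find that:

$$K = (l+1)^{n-1} \qquad (20)$$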
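The step from the Euler characteristics to (20) is a short binomial computation, sketched here under the reading that K is the sum over all nonempty subsets $\mathcal{P}$ of the n lines of $(-)^{|\mathcal{P}|-1}\chi(CP^{\mathcal{P}} - W^{\mathcal{P}})$ and that the hypersurface formula is $\chi(W^{\mathcal{P}}) = [\,|\mathcal{P}|(l+1) - 1 + (-l)^{|\mathcal{P}|}\,]/(l+1)$. The summation index k is introduced only for this sketch.

```latex
% With |P| = k, chi(CP^P) = k and the hypersurface formula give
\chi(CP^{\mathcal P} - W^{\mathcal P})
   = k - \frac{k(l+1) - 1 + (-l)^{k}}{l+1}
   = \frac{1 - (-l)^{k}}{l+1}.
% There are \binom{n}{k} subsimplexes with |P| = k, hence
K = \sum_{k=1}^{n} \binom{n}{k} (-1)^{k-1}\, \frac{1 - (-l)^{k}}{l+1}
  = \frac{1}{l+1}\Bigl[\, 1 + \bigl((1+l)^{n} - 1\bigr) \Bigr]
  = (l+1)^{n-1},
% using
\sum_{k=1}^{n} \binom{n}{k} (-1)^{k-1} = 1, \qquad
\sum_{k=1}^{n} \binom{n}{k} (-1)^{k-1} (-l)^{k} = -\bigl((1+l)^{n} - 1\bigr).
```

For l = 1 and n = 4 this gives K = 8, the value quoted for the spinless box diagram.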


This result looks very neat and simple. Unfortunately as we said it applies
only to fictitious FRA as soon as l > 1. The spinless box diagram is, however,
included and for it we get K = 8. In order to include higher order diagrams
from real life we need to work even harder. This we shall discuss in Part V.

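The similar formula for the complete intersection $W_1^{\mathcal{P}} \cap W_2^{\mathcal{P}}$ of two
general hypersurfaces of degree l + 1, h + 1, respectively, is

$$\chi(W_1^{\mathcal{P}} \cap W_2^{\mathcal{P}}) = |\mathcal{P}| + \frac{1}{l-h}\left[ \frac{h+1}{l+1}\bigl(1 - (-l)^{|\mathcal{P}|}\bigr) - \frac{l+1}{h+1}\bigl(1 - (-h)^{|\mathcal{P}|}\bigr) \right] \qquad (21)$$

It follows that if $W = W_1 \cup W_2$ we have

$$K = \sum_{b=0}^{n-1} (l+1)^{n-1-b}(h+1)^{b} = \frac{(l+1)^n - (h+1)^n}{(l+1) - (h+1)} \qquad (22)$$

For the triangle diagram l = 1, h = 0, n = 3, and K = 7.⁴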
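The counting formulas (20) and (22) can be checked numerically. The sketch below is not part of the lecture: it assumes the reading $K = \sum_{\mathcal{P}} (-)^{|\mathcal{P}|-1}\chi(CP^{\mathcal{P}} - W^{\mathcal{P}})$ over nonempty subsets of the n lines, and it takes the Euler characteristic of a smooth complete intersection from the standard generating function (the coefficient of $t^N$ in $(1+t)^{N+1}\prod_i d_i t/(1+d_i t)$ for degrees $d_i$ in $CP^N$). The function names are hypothetical.

```python
from itertools import combinations
from math import comb


def chi_complete_intersection(dim, degrees):
    """Euler characteristic of a smooth complete intersection of the given
    degrees in CP^dim: coefficient of t^dim in
    (1 + t)^(dim + 1) * prod(d * t / (1 + d * t))."""
    r = len(degrees)
    if r > dim:
        return 0  # a generic intersection of this many hypersurfaces is empty
    order = dim - r
    series = [1] + [0] * order  # power series of prod 1 / (1 + d t)
    for d in degrees:
        geom = [(-d) ** m for m in range(order + 1)]
        series = [sum(series[a] * geom[m - a] for a in range(m + 1))
                  for m in range(order + 1)]
    top = 1
    for d in degrees:
        top *= d
    return top * sum(comb(dim + 1, j) * series[order - j]
                     for j in range(order + 1))


def chi_complement(dim, degrees):
    """chi(CP^dim minus a union of generic hypersurfaces), by inclusion-exclusion."""
    total = 0
    for r in range(len(degrees) + 1):
        for subset in combinations(degrees, r):
            total += (-1) ** r * chi_complete_intersection(dim, subset)
    return total


def count_K(n, degrees):
    """Assumed formula: sum over nonempty subsets P of the n lines of
    (-1)^(|P| - 1) * chi(CP^P - W^P); in general position only |P| matters."""
    return sum(comb(n, k) * (-1) ** (k - 1) * chi_complement(k - 1, degrees)
               for k in range(1, n + 1))


print(count_K(4, [2]))     # box diagram, l = 1, n = 4: prints 8
print(count_K(3, [2, 1]))  # triangle diagram, l = 1, h = 0, n = 3: prints 7
```

Working from the generating function rather than from the closed forms keeps the check independent of how (19) and (21) are written, and it also covers the case l = h where the quotient in (22) must be read as the sum.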


[FIGURE 2: triangle diagram; box diagram]


PART IV 


FRA with l > 1 present difficulties and additional complications which
increase with l. These difficulties may be roughly classified into the following
types:


4 The complete analysis of the triangle diagram has been kindly communicated to
me by Dr. B. Jakšić.




1. If n - (l + 1)m/2 < 0, W reduces into the two components
$W_1$: D(z, s) = 0 and $W_2$: d(z) = 0. If $W_1$ and $W_2$ were regular and in
general position and of degrees l + 1 and l, this case could be solved explicitly
by the use of the Mayer-Vietoris sequence and K is given by (22).

2. W, or some of the components of W if (1) holds, is not regular. In 
this case the general treatment follows the lines sketched in the previous parts 
of the lecture. The main difficulty is that there are practically no available 
formulas for the Euler characteristics of nonregular hypersurfaces and in fact 
there is no classification of nonregular algebraic hypersurfaces with the 
exception of the lower dimensions and even there it is extremely complicated. 
This does not mean that in any specific case some ingenuity will not reach the 
final result. 

3. The different components of W are not in general position with 
respect to each other. This may happen in concurrence with (2), and to make 
things worse there are simple diagrams like the envelope (m = 4), where the 
locus of singularities of one of the components of W lies entirely on the other 
(Fig. 3). Here for the moment I am stuck. 


FIGURE 3 


4. In addition to (1) to (3) W will contain the class of subsimplexes
(hereafter named singular simplexes) of $S^{\mathcal{Q}}$ obtained by putting $z_i = 0$, $i \in \mathcal{Y}'$,
where $\mathcal{Y}'$ is the set of lines associated with a cycle of $\mathcal{Q}$ as described before in the
definition of d(z). In order to describe these sets more efficiently we label
them with the complement $\mathcal{Y}$ of $\mathcal{Y}'$ and we use end of the alphabet letters so that
for instance $CP^{\mathcal{Y}}$ is the variety $z_i = 0$, $i \in \mathcal{Y}'$, with $S^{\mathcal{Y}}$ the corresponding
simplex, and it is understood that $S^{\mathcal{Y}}$ and $CP^{\mathcal{Y}}$ lie on W. If W is reducible
then $S^{\mathcal{Y}}$, $CP^{\mathcal{Y}}$ lie on both components and therefore on their intersection. If
l = 1 all singular simplexes are disjoint for on their intersection all $z_i$ would
have to vanish. If l > 1 this is no longer true and we may consider a whole
family of singular simplexes forming a partially ordered set under the inclusion
relation. To this purpose we introduce a level function on the family of
simplexes as follows. Let $S^{\mathcal{Y}}$ be a singular simplex and take the complex
$\overline{\mathcal{Q} - \mathcal{Y}}$, that is, the closure of $\mathcal{Q} - \mathcal{Y}$. Define

$$v(S^{\mathcal{Y}}) = \operatorname{rank} H_1(\overline{\mathcal{Q} - \mathcal{Y}})$$

$v(S^{\mathcal{Y}})$ is essentially the number of loops on which the $z_i$ vanish. Evidently
$1 \le v \le l$ and also the smaller v is, the larger the corresponding simplex. If
v = 1 then the simplex is not contained in any other simplex and we call it
maximal. Any singular simplex of level v can be represented as the intersection
of v maximal simplexes. We see therefore that as soon as we reach
l > 1, we encounter very complicated structures. It can be also verified that
on a singular simplex of level v both functions d(z), D(z, s) and their derivatives
up to and including the order v - 1 vanish. Therefore on the levels
v > 1 W is singular and if reducible both its components are singular.
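The level function is the first Betti number of a graph, so it can be computed from a list of lines alone. The sketch below assumes that the complex in the definition is the subgraph of the diagram spanned by the lines on which the $z_i$ vanish, each line given as a pair of vertex labels; the function name and the example graphs are hypothetical and serve only as an illustration.

```python
def cycle_rank(lines):
    """First Betti number (number of independent loops) of the graph spanned
    by the given lines, each a pair of vertex labels: E - V + C."""
    parent = {}

    def find(v):
        # Union-find root lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    components = 0
    for a, b in lines:
        for v in (a, b):
            if v not in parent:
                parent[v] = v
                components += 1
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            components -= 1
    return len(lines) - len(parent) + components


# Hypothetical two-loop graph: a square 1-2-3-4 with the diagonal (1, 3).
square_with_diagonal = [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)]
print(cycle_rank(square_with_diagonal))      # all lines vanish: level 2
print(cycle_rank([(1, 2), (2, 3), (1, 3)]))  # one loop of lines vanishes: level 1
```

In this toy graph the two triangles give two maximal simplexes of level 1, and the level 2 simplex obtained by setting the lines of both loops to zero is their intersection, in line with the statement above.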

It is reasonable therefore to try first to understand the case l = 1 where
all singular simplexes are disjoint. Should there be several singular simplexes
we can easily deal with them once we know how to deal with one simplex $S^{\mathcal{Y}}$
only; hence, we shall limit ourselves to this case. The usual approach, as we
find it in the current literature on the subject, is to blow up along the linear
variety $CP^{\mathcal{Y}}$ containing $S^{\mathcal{Y}}$. This amounts to the construction of a new pair
of algebraic varieties $\widetilde{CP}^{\mathcal{Q}} \supset \widetilde{CP}^{\mathcal{Y}}$ and of an analytical map f such that:

$$\begin{array}{ccc} \widetilde{CP}^{\mathcal{Q}} & \supset & \widetilde{CP}^{\mathcal{Y}} \\ f \downarrow & & f \downarrow \\ CP^{\mathcal{Q}} & \supset & CP^{\mathcal{Y}} \end{array} \qquad (23)$$

and such that $f \,|\, \widetilde{CP}^{\mathcal{Q}} - \widetilde{CP}^{\mathcal{Y}}$ reduces to an isomorphism, and $\widetilde{CP}^{\mathcal{Y}}$ is of
codimension 1. We use also the notation $\widetilde{V}$ for the strict blowing up of any set
V defined as

$$\widetilde{V} = \overline{f^{-1}(V - V \cap CP^{\mathcal{Y}})}$$

May I just remind those who are not familiar with these methods how
the blow-up is actually achieved. Let $\mathcal{J}^{\mathcal{Y}}$ be the ideal of homogeneous polynomials
vanishing on $CP^{\mathcal{Y}}$ and $\mathcal{J}_p^{\mathcal{Y}}$ the subset of $\mathcal{J}^{\mathcal{Y}}$ of polynomials of degree
p. Select a basis of independent polynomials in $\mathcal{J}_p^{\mathcal{Y}}$: $P_i(z)$, $i = 1 \cdots L$. Consider
the complex projective space $CP^{L-1}$ in the homogeneous coordinates
$\zeta_1 \cdots \zeta_L$ and in it the algebraic variety of parametric representation:

$$\zeta_i = P_i(z) \qquad i = 1 \cdots L$$

For sufficiently high p it can be shown that this variety is a model for $\widetilde{CP}^{\mathcal{Q}}$
and that for larger values of p all models one obtains in this way are isomorphic.
One could choose other ideals contained in $\mathcal{J}^{\mathcal{Y}}$ and one would
achieve in this way different types of blow-ups. I think that at the end we
shall consider all sorts of weird blow-ups in order to cope with the just as
weird collection of singularities offered by the FRA. But for the moment it is
enough to work out just the simplest case.

We come back to the motivations behind the blowing-up; they essentially
reduce to the fact that on $\widetilde{CP}^{\mathcal{Q}}$, $\widetilde{S}^{\mathcal{Q}}$ and $\widetilde{W}$ do not intersect. $\widetilde{S}^{\mathcal{Q}}$ is no
longer a simplex, however, and it will have a new "face" where $CP^{\mathcal{Y}}$ has been
blown up into codimension 1. Since $\widetilde{CP}^{\mathcal{Q}}$ is a manifold we may introduce
local coordinates and carry out the integration of the FRA in these coordinates.
By introducing these new coordinates the original differential form $\omega$
becomes a differential form $\widetilde{\omega}$. There are then two possibilities:

(A) The form $\widetilde{\omega}$ is singular on $\widetilde{W}$ only. In this case we deal with a
theory exactly similar to the one worked out so far, the domain of integration
$\widetilde{S}$ does not intersect the singularity, the integral converges and we may repeat
our derivation of the differential equation with the only difference that now we
work in $\widetilde{CP}^{\mathcal{Q}}$ which is not a complex projective space.

(B) $\widetilde{\omega}$ is singular on the complete blow-up of W, that is, $f^{-1}[W]$ which
is just $\widetilde{W} \cup \widetilde{CP}^{\mathcal{Y}}$. In this case a whole face of $\widetilde{S}^{\mathcal{Q}}$ lies on a hypersurface
$\widetilde{CP}^{\mathcal{Y}}$ of singularities and the integral diverges. In fact there is no need to
blow up in order to see that it diverges. If we look at the original form in
$CP^{\mathcal{Q}}$ we see immediately that this corresponds to an integral where the power
of the singularity in the denominator is too high for convergence. We see
that convergence is translated by the blowing-up procedure into absence of
singularities on $\widetilde{CP}^{\mathcal{Y}}$. The nice thing about blowing up is that now we have
again a proper homology group we can talk about (in case A); in fact,

$$H_{n-1}\Bigl(\widetilde{CP}^{\mathcal{Q}} - \widetilde{W},\ \bigcup_i \bigl(\widetilde{CP}^{\mathcal{Q}-i} - \widetilde{W}^{\mathcal{Q}-i}\bigr) \cup \bigl(\widetilde{CP}^{\mathcal{Y}} - \widetilde{CP}^{\mathcal{Y}} \cap \widetilde{W}\bigr)\Bigr) \qquad (24)$$


Here we can use compact chains of integration. Before, since the simplex
$S^{\mathcal{Q}}$ was reaching W we could not use compact homology and all the relevant
machinery. This is the reason why people working on FRA from the homological
point of view are more or less forced to blow up. In this lecture, however,
I have emphasized more the cohomology side and I think there is a nice
way to have all the advantages of blow-up without the disadvantages: among
these the need to use methods of algebraic geometry which are far removed
from the daily life of the ordinary physicists. The idea here is contained in
my previous remark that cohomology in $\widetilde{CP}^{\mathcal{Q}} - \widetilde{W}$ is essentially the convergent
cohomology in $CP^{\mathcal{Q}} - W$. Convergent here means just the property
of forms to converge when integrated over the subsimplexes of $S^{\mathcal{Q}}$ of the right
dimension. In order to achieve convergence the power of the denominator
in the form must not be too high. Lascoux and I have examined in detail the
conditions of convergence of a form. The interesting part of it is that these
conditions can be best described by means of the idea of filtration.

We may state it shortly as follows: To each group of forms $C^p(CP^{\mathcal{Q}} - W)$
we associate a subgroup of convergent forms $C_c^p(CP^{\mathcal{Q}} - W)$ with the
property that

$$d\,C_c^p(CP^{\mathcal{Q}} - W) \subset C_c^{p+1}(CP^{\mathcal{Q}} - W)$$

The restriction of d to the convergent forms has then all the properties of a
boundary operator and we can construct a convergent cohomology out of it.
From the previous discussion this cohomology will be isomorphic to
$H^*(\widetilde{CP}^{\mathcal{Q}} - \widetilde{W})$. The groups $C_c^p$ are commonly referred to as filtered groups.
(In general one considers a sequence of nested subgroups $C^p \supset C_1^p \supset C_2^p \cdots C_k^p$
such that $dC_i^p \subset C_i^{p+1}$; this generalization will certainly be needed when
several disjoint varieties are blown up.)

What Lascoux and I tried, and to a large extent succeeded in doing, is
to characterize the filtered cohomology without recourse to a specific and
explicit blow-up procedure. At the same time we found an exact sequence,
similar to Leray's sequence, which ties together the ordinary cohomology, the filtered
cohomology and the cohomology of $\widetilde{CP}^{\mathcal{Y}} - \widetilde{CP}^{\mathcal{Y}} \cap \widetilde{W}$. This we do in the
following way. We introduce a new set of variables, $\xi_i$, $\rho$:

$$z_i = \xi_i,\ i \in \mathcal{Y} \qquad z_i = \rho\,\xi_i,\ i \in \mathcal{Y}' \qquad z_{\mathcal{Y}} = \{z_i,\ i \in \mathcal{Y}\} \ \text{etc.}$$

where for convenience we label $\mathcal{Y}$ as $\{1 \cdots t\}$. By differentiating we find

$$dz_i = d\xi_i,\ i \le t \qquad dz_i = \rho\, d\xi_i + \xi_i\, d\rho,\ i > t$$

We now replace the $dz_i$ with their expression in the right-hand side in the
definitions of $\omega_{\mathcal{P}}$ and $\eta_{\mathcal{P}}$. We find


dp 
Wy = pla (os! + re A Ona A Bang — 1! } (25) 


ny’ dp ny’ 
Neg = plan” ( “IF me A tang Nang — 190% } 


where 
Og = /\ dé; etc. 
ie oS 
Similarly the projective form 
wrt = Rana 
|s|=p 


is replaced by the form 


7 ney’ dp nw 
wr! = ¥) ReSa, Pear )piw (na! +P A tena A Mana e" } 


The point $\rho = 0$ evidently corresponds to $z_i = 0$, $i \in \mathcal{Y}'$, that is, $CP^{\mathcal{Y}}$. In
order to see more clearly what happens in the neighborhood of $CP^{\mathcal{Y}}$ we expand
the coefficients of $\omega$ as


Rega, poy) = » RO (Ey ) Eg )pt fowl 
so that 
wo! = 2 p* Z RO (Eg ) Ca Neg? a a ae dp A 3 RO tne A ning —)'4°*" 


Our definition of filtered form is that all the terms with negative powers in the 
above expansion should vanish: 
2 REx, oan =0 k<0 


ny’ (26) 
> Ry ng A Nyng(—-DiVo*"l=0 k<0 


We can give here just a few words of explanation as to why these conditions
are in fact a filtration. We introduce the auxiliary operators $\lambda_k$:


A,@ = d RO(z)ts 


These operators are easily seen to commute with d and z. On affine forms 
we introduce also the partial z operators zy Zy-, aS 


Lay’ 2 M gg = 2 M pany \ Nang(—)e" 
Zy 2 M gD = Zy d M gO gny \ Veg ny! = 2 M Nanny Magny (27) 
We find after some algebra that 
(Za)? = (zy)? = 0 (28) 
Zy Zy + ZayZy = 9 


But in general it is not true that dz, + zy d= 0 unless we restrict ourselves to 
the image of A,. It follows that filtration conditions can be written as 


A,0=0 k<0 > zygd4Awma=0 k<0O0 


ZyA,W = —ZyA,w=0 
We deduce immediately that 
di,w =i, dw =0 
AZyAyW = Zy dAypW = ZyAy dw = 0 


It follows also that dw is a filtered form. 




In order to arrive at the Leray-type sequence we need some more definitions.
We call biprojective the projective forms of the kind:


T= 2 Rg (Z)Nstnw A yng (—) ">" 
=p 


where R is separately homogeneous of degree $-|\mathcal{P} \cap \mathcal{Y}|$ and $-|\mathcal{P} \cap \mathcal{Y}'|$ in
$z_{\mathcal{Y}}$ and $z_{\mathcal{Y}'}$, respectively. Clearly, $z_{\mathcal{Y}'}\lambda_0\omega$ is a biprojective form. The
boundary of a biprojective form is again biprojective. We can repeat all the
cohomology theory for biprojective forms just as we did for ordinary cohomology.
We should think of biprojective forms as differential forms on the
blowing up $\widetilde{CP}^{\mathcal{Y}}$ of $CP^{\mathcal{Y}}$. It is well known that $\widetilde{CP}^{\mathcal{Y}}$ has the structure of a
fiber space with base $CP^{\mathcal{Y}}$ and fiber $CP^{\mathcal{Y}'}$. We expect forms on $\widetilde{CP}^{\mathcal{Y}}$ to be
written locally as a direct product of forms on $CP^{\mathcal{Y}}$ and $CP^{\mathcal{Y}'}$ and this brings
us to biprojective forms. The operator $\lambda_0$ gives us a map:


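$$C^{p-1}(\widetilde{CP}^{\mathcal{Y}} - \widetilde{CP}^{\mathcal{Y}} \cap \widetilde{W}) \xleftarrow{\ \lambda_0\ } C^p(CP^{\mathcal{Q}} - W)$$

$\lambda_0$ maps boundaries into boundaries; therefore it generates a map between the
corresponding cohomology groups:

$$H^p(CP^{\mathcal{Q}} - W) \xrightarrow{\ \lambda_0^*\ } H^{p-1}(\widetilde{CP}^{\mathcal{Y}} - \widetilde{CP}^{\mathcal{Y}} \cap \widetilde{W})$$

We call $\lambda_0$ and $\lambda_0^*$ the residue operators. We have also the injection

$$C_c^p(CP^{\mathcal{Q}} - W) \xrightarrow{\ i\ } C^p(\widetilde{CP}^{\mathcal{Q}} - \widetilde{W} \cup \widetilde{CP}^{\mathcal{Y}}) = C^p(CP^{\mathcal{Q}} - W)$$

Also this map generates a map

$$H_c^p(CP^{\mathcal{Q}} - W) \xrightarrow{\ i^*\ } H^p(CP^{\mathcal{Q}} - W)$$

It is evident that $\lambda_0^* i^* = 0$ but it takes much more work to show that in fact the
sequence:

$$H_c^p(CP^{\mathcal{Q}} - W) \xrightarrow{\ i^*\ } H^p(CP^{\mathcal{Q}} - W) \xrightarrow{\ \lambda_0^*\ } H^{p-1}(\widetilde{CP}^{\mathcal{Y}} - \widetilde{CP}^{\mathcal{Y}} \cap \widetilde{W})$$

is exact at $H^p(CP^{\mathcal{Q}} - W)$. We skip the proof here.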
Finally, suppose that we write the equation of Was F(zy, zz) = 0 where 
F= D(z, s) or d(z)D(z). The leading term F,(za, zy.) in the expansion: 


F(2y, PZa) = dpe F(zy , Zar) 


is the equation of Wm CP”. Suppose now that a closed biprojective form 
t?~! be given, singular on Wan CY”: 
qP-! P (Za ’ Za) dia 


= — A (— 
41 Fo(Zy, Zy)" Netny Nang’ 




We call a projective form y > 0 an extension of t if it satisfies the following 
conditions: 
Ay pP = 0 k < 0 
ZyAgw? =t  pregularonCP® — W 


Extensions of a given form t can always be found and there actually are in- 
finitely many of them; one example is 


Pg 
P— _—_— 


Once an extension p of t is found, we take its boundary. It can then be 
verified that du? is a filtered form and that its filtered cohomology class 
depends only on the class of t?~'. It follows that we have a map 


H?~"\(CB® — CB® a W) > H?- (CP? — W) 
It can be checked that the triangle of maps 
H*(CP? — W) 4 H*(CP — W) 
v* A* (29) 
H*(CP® — CP® AW) 


is everywhere exact. I like to see this sequence as a sort of affine Leray 
sequence. The usual Leray sequence reads 


HX) + H’(X — U) 
p-1— p-2 (30) 
\ 
H?-*(U) 
where X, U are regular algebraic varieties. 
The Leray sequence is identical with ours provided one chooses X =
CP? - W, U = CP* - CP* AW. At the moment I do not know whether
the Leray sequence automatically applies to noncompact manifolds; if so,
then (29) is really nothing new. What is important anyway is the relation
among the different Euler characteristics which one gets out of (29):


y(CP? — W)= y(CP® — W) + x(CP® — CP® a W) 


In the case / = 1 the group H*(CP® — CP” nm W) is easy to handle because 
the fiber becomes trivial. The reason for this is that it can be shown that 
in the neighborhood of the power expansion of d and D reads: 


d(z) = ( > z)da(eo) 5 


D(z, s) = » z; Dy(Za) 




where dy and Dy are the same functions defined for a diagram where all the 
lines in Y’ have been contracted and their boundary points identified. This 
is indeed just the diagram D/%’. Let W® be represented by d®D*® = 0 or 
D” in CP® and let CP,” be the hyperplane 

y z;=0 

iey’ 
It follows that locally Wa CP” is just the product of CP” — W® and of 
CP” —~CP,”. But CP® — CP,” is just affine space and is retractable into 
a point. Therefore as far as cohomology is concerned we may replace 
everywhere CP® — CP® A W with CP® — W®. The theory of CP” — WwW” 
is just the theory of the contracted diagram D/%’. It is striking that the
same diagrams appear in renormalization theory as exemplified by the last
monumental paper by Hepp⁵ on the subject. I do not think that this coincidence
is accidental. I am instead of the opinion that this way we shall
ultimately reach a neater formulation of renormalization through algebraic
topological methods. I have made no attempt so far to deal with the cases (7)
but I feel that they will eventually be discussed with a generalization of the
present methods.

I hope I have been able to summarize our present attempts and attitudes 
toward the whole subject of FRA in a way which should be understandable 
to both mathematicians and physicists in the spirit of these Rencontres. In 
spite of all these dedicated efforts by a lot of theoreticians, most of the 
diagrams have resisted any attempt to subjugate them by the sheer amount of 
the computations involved. I hope, however, that enough of them will be 
studied in detail as to give some hint as to what the general structure may be. 
Till that moment we cannot expect any reasonable application of the present 
techniques to real-life physics. 


ACKNOWLEDGMENTS 


This work has been performed mainly at the Institute for Advanced 
Study in Princeton and at the Istituto di Fisica dell’Universita di Torino 
under E.O.A.R. Grant 66-29. I am particularly indebted here at the
Rencontres to Drs. R. Bott, H. Hironaka, R. Thom, and N. Steenrod for
many interesting discussions. I am also indebted to Dr. V. de Alfaro and
M. B. Jakšić whose results in this field have been crucial in deriving many
features of the present work.


⁵ K. Hepp, Comm. in Math. Physics, 2, 301 (1966).


XVII 


The Use of Padé Approximations 
in Particle Physics 


MARCEL FROISSART 


For work on Padé approximants, see Preprint CERN TH794,
July 5, 1967, "Unitary Padé Approximants in Strong Coupling
Field Theory and Application to the Calculation of the $\rho$ and $f^0$
Meson Regge Trajectories" by D. Bessis and M. Pusterla.




XVIII 


Topics in Topology and 
Differential Geometry 


RAOUL BOTT AND JOHN MATHER* 


1 The Homotopy Category
    Theorems in Linear Algebra
    Finite-Dimensional Simplicial Complexes (Polyhedra)
    Building of Simplicial Complexes
2 Morse Theory
3 Homotopy Groups
4 The Classical Groups
5 Cohomology
6 Vector Bundles and Characteristic Classes


In these lectures I had hoped to take my friends from physics on a few
quite informal hikes into the topological countryside. Coming as I did after
lectures on relativity and symmetries, I fell back on an old love of mine and
planned my first excursion around the Morse theory and the elementary
homotopy properties of spheres and the classical groups. As it turned out, there
was very little time left after this expedition. I mention this here in order to 
warn the physicist that I haven't in any way covered the field, and that many 
other excursions—for instance, into differential topology, cobordism, semi- 
simplicial theory, sheaf theory—might be of equal and possibly even greater 
interest to him. 

The notes presented here are by and large the ones taken by John Mather 
during the lectures. He did, fortunately, cut out much of the hot air, and I hope 
all the wrong statements. Obviously the notes could have been polished quite 
a bit; however, even if we had wanted to, the ruthless timetable of our charming 
editor would have made that impossible. 

Raoul Bott 


* Lectures by Raoul Bott and notes by John Mather. 






1 THE HOMOTOPY CATEGORY 


We begin by defining a category. Its objects are finite-dimensional
vector spaces over $\mathbb{R}$. Given two finite-dimensional vector spaces $V$ and $W$,
the notion of a linear map $\phi: V \to W$ is defined. We set $\mathrm{Hom}(V, W)$ = the
set of all linear maps from $V$ to $W$.

The set of linear maps satisfies two very trivial axioms.


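1. For each $V$, there exists a unique mapping $1_V: V \to V$ called the
identity.
2. Given $\phi$, $\psi$ as follows
\[ U \xrightarrow{\ \phi\ } V \xrightarrow{\ \psi\ } W \]
we can form the composition $\psi \circ \phi: U \to W$.
This composition law is associative:
\[ \phi \circ (\psi \circ \lambda) = (\phi \circ \psi) \circ \lambda \]
and the identities act as units: for $\phi: U \to V$,
\[ 1_V \circ \phi = \phi, \qquad \phi \circ 1_U = \phi \]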


In general a category $\mathcal{C}$ consists of: a class of objects, and for any
objects $A$ and $B$ in $\mathcal{C}$, a set $\mathrm{Hom}(A, B)$ called the maps from $A$ to $B$. Further,
there is given for every triple of objects $A$, $B$, $C$ in $\mathcal{C}$ a set mapping
$\mathrm{Hom}(A, B) \times \mathrm{Hom}(B, C) \to \mathrm{Hom}(A, C)$ (abstract form of composition), and a
singled-out element $1_A$ in $\mathrm{Hom}(A, A)$ such that composition is associative and $1_A$
acts as a unit.
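The two axioms can be checked by hand in the category of finite-dimensional real vector spaces. The following is a minimal sketch, assuming that linear maps $\mathbb{R}^n \to \mathbb{R}^m$ are stored as $m \times n$ matrices (lists of rows); the helper names `identity` and `compose` and the sample matrices are illustrative choices, not notation from the lectures.

```python
def identity(n):
    """Matrix of the identity map 1_V on V = R^n."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def compose(psi, phi):
    """Matrix of psi o phi (apply phi first, then psi)."""
    inner = len(phi)  # dimension of the middle space
    assert all(len(row) == inner for row in psi), "maps are not composable"
    cols = len(phi[0])
    return [[sum(psi[i][k] * phi[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(len(psi))]

# Three composable maps: lam: R^2 -> R^3, psi: R^3 -> R^2, phi: R^2 -> R^1.
lam = [[1.0, 2.0], [0.0, 1.0], [3.0, 0.0]]
psi = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
phi = [[2.0, 1.0]]

left = compose(phi, compose(psi, lam))    # phi o (psi o lam)
right = compose(compose(phi, psi), lam)   # (phi o psi) o lam
```

Here `left` and `right` agree, and composing `phi` with `identity(1)` on the left or `identity(2)` on the right returns `phi` unchanged, which is exactly what the axioms require.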


Theorems in Linear Algebra 


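Classification Theorem. We say two spaces $V$ and $W$ are isomorphic if there
exist mappings $\phi$, $\psi$
\[ V \underset{\psi}{\overset{\phi}{\rightleftarrows}} W \]
such that $\psi \circ \phi = 1$ and $\phi \circ \psi = 1$. In this case we call $\psi$ and $\phi$ isomorphisms.
We write $V \cong W$.

Theorem. Every $V \cong \underbrace{\mathbb{R}^1 \oplus \cdots \oplus \mathbb{R}^1}_{n} = \mathbb{R}^n$ and $\mathbb{R}^n \not\cong \mathbb{R}^m$ if $n \ne m$.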


Example. The category of finitely generated Abelian groups. $\mathrm{Hom}(A, B)$
= the group of homomorphisms from $A$ to $B$.




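Theorem. Every finitely generated Abelian group is of the form
\[ A \cong \underbrace{\mathbb{Z} \oplus \mathbb{Z} \oplus \cdots \oplus \mathbb{Z}}_{n} \oplus \mathbb{Z}_{p_1^{n_1}} \oplus \mathbb{Z}_{p_2^{n_2}} \oplus \cdots \oplus \mathbb{Z}_{p_k^{n_k}} \]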


The Category Hot. This is the category in which homotopy theory operates. 
Objects are topological spaces. 


Definition of Topological Space. It consists of the following data: a set $X$,
and a family of subsets $\mathcal{O}$, called open sets, subject to three axioms.

1. $\mathcal{O}$ is closed under union.
2. $\mathcal{O}$ is closed under finite intersection.
3. $\emptyset$ (the empty set) and $X$ are in $\mathcal{O}$.

Given
\[ f: X \to Y \]
a set function, $f$ is said to be a continuous map if $f^{-1}(U)$ is open when $U$ is
open in $Y$.
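For a finite set the three axioms and the continuity condition can be tested mechanically. The sketch below assumes finite spaces given as explicit lists of open sets (for a finite family, closure under pairwise unions already gives closure under all unions); the function names and the sample spaces are hypothetical illustrations.

```python
from itertools import combinations

def is_topology(points, opens):
    """Check the three axioms for a finite family of subsets of `points`."""
    opens = {frozenset(u) for u in opens}
    if frozenset() not in opens or frozenset(points) not in opens:
        return False
    for u, v in combinations(opens, 2):
        if (u | v) not in opens or (u & v) not in opens:
            return False
    return True

def is_continuous(f, points_x, opens_x, opens_y):
    """f is a dict X -> Y; continuous iff every preimage of an open set is open."""
    opens_x = {frozenset(u) for u in opens_x}
    for u in opens_y:
        preimage = frozenset(x for x in points_x if f[x] in u)
        if preimage not in opens_x:
            return False
    return True

X = {1, 2, 3}
opens_x = [set(), {1}, {1, 2}, {1, 2, 3}]
Y = {"a", "b"}
opens_y = [set(), {"a"}, {"a", "b"}]

f = {1: "a", 2: "a", 3: "b"}   # preimage of {"a"} is {1, 2}, which is open
g = {1: "b", 2: "a", 3: "a"}   # preimage of {"a"} is {2, 3}, which is not open
```

With these choices `f` is continuous and `g` is not, although both are perfectly good set functions.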

We could now speak of Hom (X, Y) as the set of continuous maps from 
X to Y. Two isomorphic objects in this category are called homeomorphic. 
This is a very difficult category. Very little is known about the homeomorphism
problem in general.

There is a more sophisticated category, the homotopy category.


Definition. If there exists a continuous family $f_t: X \to Y$, $t \in I = [0, 1]$, then
$f_0$ is called homotopic to $f_1$. More precisely, consider the following diagram.
\[ X \underset{l:\, x \mapsto (x, 0)}{\overset{u:\, x \mapsto (x, 1)}{\rightrightarrows}} X \times I \xrightarrow{\ F\ } Y \]
If $F$ is continuous, $f = F \circ l$ ($l(x) = (x, 0)$), and $g = F \circ u$ ($u(x) = (x, 1)$),
then we say $f$ is homotopic to $g$.
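A simple instance of this definition is the straight-line homotopy between two maps into Euclidean space. The sketch below assumes $Y = \mathbb{R}^2$, so that convex combinations make sense; the function name `straight_line_homotopy` and the two sample maps are illustrative and not part of the lectures.

```python
import math

def straight_line_homotopy(f, g):
    """Return F(x, t) = (1 - t) f(x) + t g(x), so F(x, 0) = f(x) and F(x, 1) = g(x)."""
    def F(x, t):
        return tuple((1 - t) * a + t * b for a, b in zip(f(x), g(x)))
    return F

def f(x):
    # a map from the real line onto the unit circle in R^2
    return (math.cos(x), math.sin(x))

def g(x):
    # the constant map to the origin
    return (0.0, 0.0)

F = straight_line_homotopy(f, g)
```

Restricting `F` to `t = 0` and `t = 1` recovers `f` and `g`, which is the statement $f = F \circ l$, $g = F \circ u$. The same formula would not define a homotopy if the target were the circle itself, since the intermediate points leave the circle.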


Mild Theorem. (1) Homotopy is an equivalence relation. This means $f \sim f$;
if $f \sim g$, then $g \sim f$; and if $f \sim g$ and $g \sim h$, then $f \sim h$ (where $\sim$ means "is
homotopic to").
(2) $f \sim f_1$, $g \sim g_1 \Rightarrow f \circ g \sim f_1 \circ g_1$.

The upshot is that we can make a new category whose maps are homotopy
classes of maps (i.e., equivalence classes under the equivalence relation
of "homotopic to"). Thus, we define


Hot (1) Objects are topological spaces. 
(2) Hom (X, Y) are homotopy classes of maps. 




In the homotopy category there is also a structure theorem that says 
that everything is the direct product of simple things; however, this is a twisted 
direct product. 


Examples. Reasonable spaces. 


Finite-Dimensional Simplicial Complexes (Polyhedra) 


They look like, for example, Fig. 1. 


FIGURE 1 


(1) Let $x_0, \ldots, x_n$ be points of $\mathbb{R}^m$ in general position. Then the
simplex spanned by these, $\langle x_0, \ldots, x_n \rangle$, is the set
\[ \Bigl\{ x = \sum \lambda_i x_i : \lambda_i > 0,\ \sum \lambda_i = 1 \Bigr\}. \]
The closure of this is a union of simplices, which are called the faces of
the simplex.

The simplex $\tau$ spanned by any subset of the vertices of $\sigma$ is called a face
of $\sigma$.
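Since a face is determined by the nonempty subset of vertices that spans it, the faces of a simplex can be listed combinatorially. The following sketch assumes vertices are given by labels only (no coordinates); the function name `faces` is a hypothetical helper.

```python
from itertools import combinations

def faces(vertices):
    """Faces of the simplex on `vertices`, grouped by dimension.

    A k-dimensional face is spanned by k + 1 of the vertices.
    """
    return {k - 1: list(combinations(vertices, k))
            for k in range(1, len(vertices) + 1)}

triangle = faces(("x0", "x1", "x2"))
# triangle[0] lists the 3 vertices, triangle[1] the 3 edges,
# and triangle[2] the single 2-dimensional face.
```

An $n$-simplex has $n + 1$ vertices and therefore $2^{n+1} - 1$ faces in all, counting the simplex itself.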

A polyhedron $K \subset \mathbb{R}^m$ is a finite union of simplices satisfying two conditions:
(1) if $\sigma$ is in $K$, then all the faces of $\sigma$ are in $K$; (2) if $\sigma$ and $\tau$ intersect,
then they are the same. Consider the two complexes in Fig. 2. They are
different in the topological category (i.e., nonhomeomorphic) but are isomorphic
in the homotopy category; that is, there exist mappings $f$ and $g$ between them,
in opposite directions, with
\[ f \circ g \sim 1, \qquad g \circ f \sim 1 \]




FIGURE 2 


We may take the inclusion for $f$, and the mapping that is the identity on $X$
and maps $\sigma^n$ to a point for $g$. Another example of spaces that are isomorphic
in the homotopy category is the disk in two-space and a point. A tree like 
the one in Fig. 3 has the same homotopy type as a point (i.e., it is isomorphic 
to a point in the homotopy category). 


FIGURE 3 


Problem. Which of the complexes in Fig. 4 have the same homotopy type? 
As one might expect, the circle is not homotopic to a point. 

Why do we study the homotopy category? The homeomorphism 
category seems bad enough, allowing all sorts of deformation. However, 






FIGURE 4 


mathematicians discovered that the invariants used to distinguish homeo- 
morphism types mostly distinguished homotopy types also. 


Building of Simplicial Complexes 


Consider simplices of dim $< N$. Write $K' = K - \sigma^n$, where $\sigma^n$ is an
$n$-simplex. The idea is to take out $\sigma^n$, but remember how it was glued. That
is, we are given $\sigma_n$ and a map $f: \dot\sigma_n \to K'$ ($\dot\sigma_n$ denotes the boundary of $\sigma_n$) as in
Fig. 5.




FIGURE 5 




Define $\tilde{K} = K'$ disjoint union $\sigma_n$. Define an equivalence relation in $\tilde{K}$
by identifying $x \in \dot\sigma_n$ with $f(x)$ in $K'$. Set
\[ K = \tilde{K}/\!\sim \quad \text{(where $\sim$ denotes the equivalence relation)} \]


In this case it is very easy to visualize the quotient space, since $f$ is
a homeomorphism onto a subset. However, we can make this construction
without assuming $f$ is a homeomorphism.

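Given $X$, $e_n = \{|x| \le 1 \text{ in } \mathbb{R}^n\}$ and a map $\alpha: \dot e_n \to X$. Here $\dot e_n =
\{x : \sum x_i^2 = 1\}$. Then define $Y = X \cup_\alpha e_n$ as follows:
\[ \tilde{Y} = X \text{ disjoint union } e_n \]
We say a point $x \in \dot e_n$ is equivalent with $\alpha(x)$. There results a quotient
set $\tilde{Y}/\!\sim$ and a natural function $\pi: \tilde{Y} \to \tilde{Y}/\!\sim$. Define $Y$ as the topological
space with underlying set $\tilde{Y}/\!\sim$, where an open set $U \subset Y$ is one whose
preimage $\pi^{-1}(U)$ is open in $\tilde{Y}$.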

We are going to talk about the homotopy category whose objects are 
spaces that can be built up in this way. Note that the homotopy type of $Y$
does not depend on $\alpha$ but only on its homotopy class.


Proposition 1. If $\alpha$ is homotopic to $\alpha'$, then $X \cup_\alpha e_n$ has the same homotopy
type as $X \cup_{\alpha'} e_n$.

The proof presents purely technical difficulties. It would be a good 
exercise for the reader. 


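PROOF. Let $\alpha_t: \dot e_n \to X$ ($t \in I = [0, 1]$) be a homotopy connecting $\alpha$ and $\alpha'$.
(Thus $(u, t) \mapsto \alpha_t(u)$ is a continuous mapping of $\dot e_n \times I$ into $X$, $\alpha_0 = \alpha$, and
$\alpha_1 = \alpha'$.) Let $r(x)$ denote the distance of $x \in \mathbb{R}^n$ from the origin. Take
$e_n = \{x : r(x) \le 1\}$. Let $\gamma$ denote the radial projection of $e_n - \{0\}$ on $\dot e_n$
(defined by $\gamma(x) = x/r(x)$). Define $f: X \cup_\alpha e_n \to X \cup_{\alpha'} e_n$ by
\[ f(x) = x, \quad x \in X \]
\[ f(u) = \alpha_{2 - 2r(u)}(\gamma(u)), \quad u \in e_n,\ r(u) \ge \tfrac12 \]
\[ f(u) = 2u, \quad u \in e_n,\ r(u) \le \tfrac12 \]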


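We leave it to the reader to verify that this indeed defines a continuous map.
Define $g: X \cup_{\alpha'} e_n \to X \cup_\alpha e_n$ symmetrically. That is
\[ g(x) = x, \quad x \in X \]
\[ g(u) = \alpha_{2r(u) - 1}(\gamma(u)), \quad u \in e_n,\ r(u) \ge \tfrac12 \]
\[ g(u) = 2u, \quad u \in e_n,\ r(u) \le \tfrac12 \]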


One can show that $fg \sim 1$ and $gf \sim 1$ by constructing explicit homotopies.
We construct a homotopy $h$ connecting $fg$ and 1, and leave the construction
of the other homotopy to the reader. For $t \in I$, set


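\[ h_t(x) = x, \quad x \in X \]
\[ h_t(u) = \alpha_{\frac12((4 - 3t) r(u) - 2 + 3t)}(\gamma(u)), \quad u \in e_n,\ r(u) \ge \tfrac{2 - t}{4 - 3t} \]
\[ h_t(u) = \alpha_{2 - (4 - 3t) r(u)}(\gamma(u)), \quad u \in e_n,\ \tfrac{1}{4 - 3t} \le r(u) \le \tfrac{2 - t}{4 - 3t} \]
\[ h_t(u) = u + 3(1 - t)u, \quad u \in e_n,\ r(u) \le \tfrac{1}{4 - 3t} \]

We leave the verification that $h$ is a continuous map of $(X \cup_{\alpha'} e_n) \times I$ into
$X \cup_{\alpha'} e_n$ and that $h_0 = fg$, $h_1 = 1$ to the reader.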


2 MORSE THEORY 


Our program is to decompose an arbitrary differentiable manifold in 
the manner described above by means of Morse theory. 


Definition of a $C^\infty$ manifold. A topological manifold is a topological space $M$
that is Hausdorff and has a countable base for its topology, and that is locally 
Euclidean in the sense that every point has a neighborhood homeomorphic to 
Euclidean space. 


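A $C^\infty$ coordinate cover for $M$ is a cover $\{U_\alpha\}$ of $M$, together with a
homeomorphism $\phi_\alpha: U_\alpha \to V_\alpha$ of each $U_\alpha$ onto an open set $V_\alpha$ in $\mathbb{R}^n$ such that
the maps
\[ \phi_\beta \circ \phi_\alpha^{-1}: \phi_\alpha(U_\alpha \cap U_\beta) \to \phi_\beta(U_\alpha \cap U_\beta) \]
are of class $C^\infty$ as maps of open subsets of $\mathbb{R}^n$ into $\mathbb{R}^n$.

Two coordinate covers $\{U_\alpha\}$ and $\{U_\alpha'\}$ are equivalent if and only if
$\{U_\alpha, U_\alpha'\}$ is again a $C^\infty$ coordinate cover.

An equivalence class of $C^\infty$ coordinate covers is a $C^\infty$ structure on $M$, and a
topological manifold $M$ together with such a structure is a $C^\infty$ manifold.

If $M$ is a $C^\infty$ manifold and $f: M \to \mathbb{R}$, then $f$ is called $C^\infty$ if $f \circ \phi_\alpha^{-1}$ is
$C^\infty$ for all $U_\alpha$ in some coordinate cover.

A level surface of $f$ is a set $f^{-1}(a)$ for some $a \in \mathbb{R}$. In Morse theory, we
consider $M^a = \{u \mid f(u) \le a\}$.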


Definition. A point $p \in M$ is critical for $f$ if $(\partial f/\partial x_i)|_p = 0$ for all $i$ in some local
coordinate system $(x_1, \ldots, x_n)$ at $p$. The images under $f$ of critical points
are the critical values of $f$.

(A local coordinate system $(x_1, \ldots, x_n)$ at $p$ is a set of functions
$\xi_1 \circ \phi_\alpha, \ldots, \xi_n \circ \phi_\alpha$ where $\xi_1, \ldots, \xi_n$ are the standard coordinates for $\mathbb{R}^n$ and
$U_\alpha$ and $\phi_\alpha$ are as above. These functions are defined, of course, in $U_\alpha$.)




Example 


Consider a torus in $x$, $y$, $z$ space with its axis parallel to the $y$ axis.
This is illustrated in Fig. 6, where we imagine the $y$ axis as being perpendicular
to the page. Take $f$ to be the coordinate $z$. Then $f$ has four critical points, at
$a$, $b$, $c$, and $d$ in the diagram. The critical values are all distinct.


FIGURE 6 


Theorem 1. Let $M$ be a compact $C^\infty$ manifold. Let there be no critical values
of $f$ in the interval $[a, b]$. Then $M^a \simeq M^b$ (homotopy equivalence).


PROOF. Construct a vector field $X$ on $M$ for which $Xf > 0$ except at critical
points. For example, choose a Riemannian structure on $M$. The dual of
$df$ is a vector field $\nabla f = X$ with the required property. (This is the gradient
of $f$. It is defined relative to the Riemannian structure.)
(We think of vector fields as differential operators on functions.)
Our definition of the gradient $X$ of $f$ is equivalent to
\[ \langle X_p, Y_p \rangle = df_p(Y_p) \]
holding for all vector fields $Y$ on $M$.

Define a vector field $X'$ on $f^{-1}[a, b]$ by $X' = -X/Xf$. Then $X'f = -1$
on $f^{-1}[a, b]$. Extend $X'$ to a $C^\infty$ vector field on $M$. We denote the extended
vector field by the same letter. Let $g_t$ ($t \in \mathbb{R}$) denote the one-parameter group
generated by $X'$. That is, assume $g_t$ is a diffeomorphism of $M$ onto itself for
each $t \in \mathbb{R}$; that $g_0 = 1$; and that
\[ \frac{\partial}{\partial t}\, g_t(x) = X'_{g_t(x)}, \qquad x \in M,\ t \in \mathbb{R} \]




(The existence of this one-parameter group follows from standard theorems
in differential equations. From the conditions that we have imposed, it
follows that $g_{s+t} = g_s g_t$, whence the terminology one-parameter group.)
From
\[ \frac{\partial}{\partial t} f(g_t(x)) = X'f(g_t(x)) = -1 \quad \text{if } g_t(x) \in M^b - M^a \]
it follows that, for $x \in M^b$, $f(g_t(x))$ becomes $\le a$ for some $t \le b - a$. Then $f(g_t(x))$ stays
$\le a$ for larger $t$ since $X'$ points into $M^a$ along the boundary of $M^a$. Hence
$g_{b-a}$ maps $M^b$ into $M^a$. Similarly, one shows that $g_{a-b}$ maps $M^a$ into $M^b$.
Since $g_{a-b} = (g_{b-a})^{-1}$, this shows that $g_{b-a}$ maps $M^b$ diffeomorphically onto
$M^a$. Then, of course, $g_{b-a}$ is a homotopy equivalence.


Theorem 2. Suppose that there is exactly one critical value in the interval
$[a, b]$, say $c$, and exactly one critical point $p$ on the level $c$. Further assume that
$p$ is a nondegenerate critical point; that is, that the Hessian matrix $(\partial^2 f/\partial x_i\,\partial x_j)$
is nonsingular at $p$. Then $M^b \simeq M^a \cup_\alpha e_k$, where $k$ is the number of negative
eigenvalues of the Hessian.

Now consider the example of the torus in $x$, $y$, $z$ space, as above. The
critical points are nondegenerate and the number of negative eigenvalues at
each of the critical points is 0 at $a$, 1 at $b$, 1 at $c$, 2 at $d$. Thus, by Theorem 2,
\[ S^1 \times S^1 \simeq e_0 \cup_\alpha e_1 \cup_\beta e_1 \cup_\gamma e_2 \]
This is the same as the decomposition that we described in the previous lecture.
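The indices 0, 1, 1, 2 can be confirmed numerically. The sketch below assumes a particular parametrization of the torus, $(x, y, z) = ((R + r\cos v)\cos u,\ r\sin v,\ (R + r\cos v)\sin u)$ with radii $R = 2$, $r = 1$ chosen for illustration, so that the axis is the $y$ axis and the height is $z$. The Hessian is approximated by central differences in the angles $(u, v)$; all names are hypothetical.

```python
import math

R, r = 2.0, 1.0  # assumed torus radii, R > r > 0

def height(u, v):
    """Height z of the torus point with angles (u, v)."""
    return (R + r * math.cos(v)) * math.sin(u)

def hessian(f, u, v, h=1e-4):
    """Central-difference second derivatives (f_uu, f_uv, f_vv) at (u, v)."""
    fuu = (f(u + h, v) - 2 * f(u, v) + f(u - h, v)) / h ** 2
    fvv = (f(u, v + h) - 2 * f(u, v) + f(u, v - h)) / h ** 2
    fuv = (f(u + h, v + h) - f(u + h, v - h)
           - f(u - h, v + h) + f(u - h, v - h)) / (4 * h ** 2)
    return fuu, fuv, fvv

def morse_index(f, u, v):
    """Number of negative eigenvalues of the 2 x 2 Hessian of f at (u, v)."""
    fuu, fuv, fvv = hessian(f, u, v)
    mean = (fuu + fvv) / 2
    spread = math.hypot((fuu - fvv) / 2, fuv)
    return sum(1 for eig in (mean - spread, mean + spread) if eig < 0)

# Critical points a, b, c, d in order of increasing height.
critical_points = [(-math.pi / 2, 0.0), (-math.pi / 2, math.pi),
                   (math.pi / 2, math.pi), (math.pi / 2, 0.0)]
indices = [morse_index(height, u, v) for u, v in critical_points]
```

With this parametrization the heights at the four points are $-(R + r)$, $-(R - r)$, $R - r$, $R + r$, and `indices` reproduces the list of cell dimensions used in the decomposition above.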

Another example is complex projective space $CP^n$. Let $(Hz, z)$ be a
Hermitian form in $n + 1$ variables. Set $f(z) = (Hz, z)/(z, z)$. Then $f$ is a
$C^\infty$ function on $CP^n$. The critical points are the eigendirections. If the
eigenvalues are distinct, then all the critical points of $f$ are nondegenerate and
they have distinct critical values, so Theorem 2 applies. Let $\lambda_1 < \lambda_2 < \cdots < \lambda_{n+1}$
be the eigenvalues. Then the number of negative eigenvalues in the Hessian
is 0 for $\lambda_1$, 2 for $\lambda_2$, 4 for $\lambda_3$, \ldots, $2n$ for $\lambda_{n+1}$. Thus Theorem 2 shows
\[ CP^n \simeq \mathrm{pt} \cup_\alpha e_2 \cup_\beta e_4 \cup \cdots \cup e_{2n} \]
However, the theorem gives no information as to what the maps $\alpha$, $\beta$, \ldots are.
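The count of negative eigenvalues can be illustrated numerically. The sketch below assumes that $H$ has been diagonalized, $H = \mathrm{diag}(\lambda_1, \ldots, \lambda_{n+1})$, and works in the affine chart around the $j$-th coordinate direction, where $f - \lambda_j = \sum_{i \ne j} (\lambda_i - \lambda_j)|w_i|^2 / (1 + \sum |w_i|^2)$. In these coordinates the Hessian at the origin is diagonal, so it is enough to look at second differences along the real and imaginary axes of each $w_i$. The function names and the sample eigenvalues are hypothetical.

```python
def rayleigh(lams, z):
    """f(z) = (Hz, z)/(z, z) for H = diag(lams)."""
    num = sum(l * abs(zi) ** 2 for l, zi in zip(lams, z))
    den = sum(abs(zi) ** 2 for zi in z)
    return num / den

def index_at_eigendirection(lams, j, h=1e-3):
    """Count real coordinate directions at e_j with negative second derivative."""
    base = [0j] * len(lams)
    base[j] = 1 + 0j
    f0 = rayleigh(lams, base)
    count = 0
    for i in range(len(lams)):
        if i == j:
            continue
        for step in (h, 1j * h):  # real and imaginary direction of w_i
            plus, minus = list(base), list(base)
            plus[i], minus[i] = step, -step
            second = (rayleigh(lams, plus) - 2 * f0 + rayleigh(lams, minus)) / h ** 2
            if second < 0:
                count += 1
    return count

lams = [1.0, 2.0, 3.5]  # three distinct eigenvalues, i.e. the case of CP^2
indices = [index_at_eigendirection(lams, j) for j in range(len(lams))]
```

Each eigenvalue below $\lambda_j$ contributes one complex, hence two real, descending directions, which is why only even-dimensional cells appear.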


PROOF OF THEOREM 2. We need the following 


Morse Lemma. If $p$ is a nondegenerate critical point of $f$, then there exists a
coordinate system $x_1, \ldots, x_n$ near $p$ such that
\[ f = c - x_1^2 - x_2^2 - \cdots - x_k^2 + x_{k+1}^2 + \cdots + x_n^2 \]
in the corresponding coordinate neighborhood. (In any such coordinate system
$x_1(p) = \cdots = x_n(p) = 0$.)




The Morse lemma may be proved by the method used to diagonalize
quadratic forms. (See Milnor, "Morse Theory," Annals of Math. Studies
No. 51, p. 6, Princeton University Press, Princeton, N.J.)

To prove Theorem 2, we choose such a coordinate system near $p$. We
set
\[ M^{c+\varepsilon} = \{q : f(q) \le c + \varepsilon\} \]
\[ M^{c-\varepsilon} = \{q : f(q) \le c - \varepsilon\} \]
\[ C = \{q : f(q) \le c + \varepsilon \text{ and } x_1(q)^2 + \cdots + x_k(q)^2 \le \delta\} \]
where $\delta$ is a small positive number, say $< \varepsilon^2$. In Fig. 7 we have illustrated
these sets in the special case that we described before. Figure 7a shows what

FIGURE 7a 




FIGURE 7b 




these sets look like in the torus, and Fig. 7b shows their intersection with a 
suitable coordinate neighborhood (i.e., one given by the Morse lemma). 
Set 


\[ B = M^{c+\varepsilon} - C. \]
In our special case, the intersection of $B$ with the coordinate neighborhood
looks like Fig. 8. From the picture it appears that $B$ can be contracted onto
$M^{c-\varepsilon}$ by moving along the vector field $\nabla f$. This is the case, provided we
choose the Riemannian metric correctly, as we now show in complete generality.


FIGURE 8 


We choose the Riemannian metric so that inside the coordinate neighborhood
it is $dx_1^2 + \cdots + dx_n^2$. We set $A = B - M^{c-\varepsilon}$, and take $X = \nabla f$
and $X' = -X/Xf$. The latter is defined everywhere except at the critical
points of $f$. In particular, it is defined in a neighborhood of $A$. For any
$x \in A$, we let $t \mapsto g(t, x)$ denote the maximal integral curve of $X'$ through $x$,
that is, the maximal curve satisfying $g(0, x) = x$ and
\[ \frac{\partial g(t, x)}{\partial t} = X'_{g(t, x)} \]
We claim that there exists $t_0(x)$ such that $g(t_0(x), x) \in M^{c-\varepsilon}$, and such
that $g(t, x) \in A$ for $0 \le t < t_0(x)$. This may be shown by eliminating the
other possibilities. These are the following.

(a) $g(t, x)$ is defined only for $0 \le t < t_1$, for suitable $t_1$, and $g(t, x) \in A$
for all such $t$. This is excluded by the fact that $A$ is compact.




(b) $g(t, x)$ is defined for all positive $t$, and $g(t, x) \in A$. This is excluded
by
\[ \frac{\partial f(g(t, x))}{\partial t} = X'f = -1 \tag{*} \]
which implies $f(g(2\varepsilon, x)) = f(x) - 2\varepsilon \le c - \varepsilon$, so that $g(2\varepsilon, x) \notin A$.

(c) $g(t, x)$ crosses the "upper" component of the boundary of $A$; that
is, $(\dot{M}^{c+\varepsilon} - C) \cup (C \cap A)$, where the dot denotes the boundary. This is excluded by the fact that $X'$ points into
$A$ along this component of the boundary of $A$. We leave it to the reader to
verify this.
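Define $h_t: B \to B$, for $t \ge 0$, as follows (with $t_0 = t_0(x)$).
\[ h_t(x) = \begin{cases} x & \text{if } x \in M^{c-\varepsilon} \\ g(t_0, x) & \text{if } x \in A \text{ and } t \ge t_0 \\ g(t, x) & \text{if } x \in A \text{ and } t \le t_0 \end{cases} \]
Then $h_t$ is a deformation retraction of $B$ into $M^{c-\varepsilon}$ (i.e., $h_0(x) = x$, $h_t(x) = x$ if
$x \in M^{c-\varepsilon}$, and $h_t(x) \in M^{c-\varepsilon}$ if $t \ge 2\varepsilon$, by (*)). It follows that $B$ has the same
homotopy type as $M^{c-\varepsilon}$.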

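Now $M^{c+\varepsilon} = B \cup C$. From the fact that the pair $(C, C \cap B)$ has the
same homotopy type as the pair $(e_k, \dot e_k)$ it follows that
\[ M^{c+\varepsilon} \simeq M^{c-\varepsilon} \cup_\alpha e_k \quad \text{(homotopy equivalence)} \]
By Theorem 1, $M^{c+\varepsilon} \simeq M^b$ and $M^{c-\varepsilon} \simeq M^a$, which completes the proof.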

As an example, we sketch the proof that complex projective $n$-space,
$CP^n$, can be built up from any nonsingular hypersurface $X$ by attaching cells
of dim $\ge n$. This result greatly simplifies the computation of the homology
of such a hypersurface, for it implies that the inclusion induces isomorphisms
$H_r(X) \cong H_r(CP^n)$ and $H^r(CP^n) \cong H^r(X)$ for $r < n - 1$. Then the homology
$H_r(X)$ may be determined using Poincaré duality for all $r \ne n - 1$. In fact,
we obtain $H_r(X) = \mathbb{Z}$ for all even $r \ne n - 1$ and $H_r(X) = 0$ for all odd $r \ne
n - 1$.

We recall that $CP^n$ is defined as the quotient of $\mathbb{C}^{n+1} - \{0\}$ by the
equivalence relation $z \sim \lambda z$ for all $z \in \mathbb{C}^{n+1} - \{0\}$, $\lambda \in \mathbb{C} - \{0\}$. We consider
a polynomial $p$ in variables $z_0, \ldots, z_n$ that is homogeneous of order $m$. The
equation $p = 0$ defines a set $X$ in $CP^n$, since $p(z) = 0$ implies $p(\lambda z) = 0$. Such
a set is called a hypersurface. It is called nonsingular if there is no common
solution (in $CP^n$) of the equations
\[ \frac{\partial p}{\partial z_0} = \cdots = \frac{\partial p}{\partial z_n} = 0 \]
Set
\[ f(z) = \frac{|p(z)|^2}{\|z\|^{2m}} \]



This is homogeneous of degree 0, so defines a function $f$ on $CP^n$. We claim
that the Hessian of $f$ at every critical point not on $X$ has at least $n$ negative
eigenvalues. Assuming this, it is easy to see that $CP^n$ can be built up from $X$
by attaching cells of index $\ge n$. For it is a theorem that $f$ can be approximated
in the $C^2$ topology by a function $g$ that has only nondegenerate critical
points on $CP^n - X$ and that equals $f$ near $X$. (In the case of a compact
domain, approximation in the $C^2$ topology means uniform approximation of
the values and first two derivatives of the function.) If $g$ is close enough to
$f$ in the $C^2$ topology, then the Hessian of $g$ has at least $n$ negative eigenvalues
at every critical point not on $X$.
Thus, for $\varepsilon > 0$ small,
\[ CP^n \simeq CP^\varepsilon \cup_\alpha e^{(1)} \cup_\beta e^{(2)} \cup \cdots \cup e^{(l)} \]
where each $e^{(i)}$ is a cell of dimension $\ge n$ and $CP^\varepsilon = \{z \in CP^n : g(z) \le \varepsilon\} =
\{z \in CP^n : f(z) \le \varepsilon\}$. It is easily seen that $CP^\varepsilon \simeq X$. In fact, one can construct
a deformation retraction of $CP^\varepsilon$ into $X$ by following the gradients of $f$.
Thus
\[ CP^n \simeq X \cup_\alpha e^{(1)} \cup_\beta \cdots \cup e^{(l)} \]


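The proof that the Hessian of $f$ has at least $n$ negative eigenvalues at
every critical point not on $X$ goes as follows. Let $A^r$ denote the vector space
of $C^\infty$ complex-valued $r$-forms on $CP^n$. Then there is a canonical splitting
\[ A^r = A^{r,0} \oplus A^{r-1,1} \oplus \cdots \oplus A^{0,r} \]
that can be described as follows: Let $w_1 = u_1 + i v_1, \ldots, w_n = u_n + i v_n$ be
holomorphic local coordinates for $CP^n$. Let $dw_j = du_j + i\,dv_j$ and $d\bar{w}_j =
du_j - i\,dv_j$. Then $A^{r-k,k}$ consists of the $r$-forms that may locally be expressed
as sums of expressions of the form
\[ u\; dw_{i(1)} \wedge \cdots \wedge dw_{i(r-k)} \wedge d\bar{w}_{j(1)} \wedge \cdots \wedge d\bar{w}_{j(k)} \]
where $u$ is a $C^\infty$ function.

It can be shown that for any $v \in A^{r-k,k}$, we have $dv \in A^{r-k+1,k} +
A^{r-k,k+1}$. One denotes the $A^{r-k+1,k}$ component of $dv$ by $d'v$ and the
$A^{r-k,k+1}$ component of $dv$ by $d''v$. If $u$ is a $C^\infty$ function, one has locally,
writing $z_j = x_j + i y_j$ for the holomorphic coordinates,
\[ d'u = \sum_j \frac{\partial u}{\partial z_j}\, dz_j, \qquad d''u = \sum_j \frac{\partial u}{\partial \bar{z}_j}\, d\bar{z}_j \]
where $\partial/\partial z_j = \tfrac12(\partial/\partial x_j - i\,\partial/\partial y_j)$ and $\partial/\partial \bar{z}_j = \tfrac12(\partial/\partial x_j + i\,\partial/\partial y_j)$. Also, from
$d^2 = 0$, one obtains $(d')^2 = (d'')^2 = 0$ and $d'd'' = -d''d'$.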




Now one computes a formula for $d'd''f$ by computing $d'd'' \log f$ in two
different ways. Using the definition of $f$, we get
\[ d'd'' \log f = d'd'' \log p + d'd'' \log \bar{p} - m\, d'd'' \log \|z\|^2 \]
Since $\log p$ is holomorphic, $d'' \log p = 0$, so the first term vanishes. Likewise
$d' \log \bar{p} = 0$, so the second term is zero, by the anticommutativity of $d'$ and
$d''$. Thus,
\[ d'd'' \log f = -m\, d'd'' \log \|z\|^2 \]
But,
\[ d'd'' \log f = d'\Bigl(\frac{d''f}{f}\Bigr) = \frac{d'd''f}{f} - \frac{d'f \wedge d''f}{f^2} \]
so that
\[ \frac{d'd''f}{f} = \frac{d'f \wedge d''f}{f^2} - m\, d'd'' \log \|z\|^2 \]
At a critical point of $f$, this becomes
\[ \frac{d'd''f}{f} = -m\, d'd'' \log \|z\|^2 \]
From this formula, one can show that the Hessian of $f$ has at least $n$ negative
eigenvalues at each critical point. We leave this as a (difficult) exercise for
the reader.
Next, we consider the application of Morse theory to the study of the
space of loops on a manifold. Let $M$ be a compact manifold with a given
smooth Riemannian structure. Consider $p$ and $q \in M$. Let
\[ \Omega M = \{\text{maps } u: I \to M\} \]
with the following properties.

(1) $u$ is piecewise differentiable, $u(0) = p$, and $u(1) = q$.
(2) $u$ is parametrized proportional to arc length.

We provide $\Omega M$ with a metric $\rho$ defined as follows.
\[ \rho(u, v) = \sup_{t \in I} \operatorname{dist}(u(t), v(t)) + |Ju - Jv| \]
where $Ju$ denotes the length of $u$. Then the length function $J$ is continuous.

We set
\[ \Omega^a M = \{u \in \Omega M : Ju \le a\} \]
Let $M^n = M \times M \times \cdots \times M$ (Cartesian product, $n$ times). Define
$\phi^n: M^n \to \mathbb{R}$ by
\[ \phi^n(x_1, \ldots, x_n) = \sum_{i=1}^{n-1} d(x_i, x_{i+1})^2 + d(p, x_1)^2 + d(x_n, q)^2 \]




where d denotes the distance function. This is a continuous function but, 
of course, it is not smooth. However, when the points are close there is a 
unique minimal geodesic joining them. This geodesic depends smoothly on 
the end points, so that in the region where the points are close, $d^2$ is smooth.
We will say a positive number $a$ is regular if there is no geodesic from $p$
to $q$ of length $a$.
Set
\[ b = \frac{a^2}{n + 1} \quad \text{and} \quad M_b^n = \{x \in M^n : \phi^n(x) \le b\} \]


Theorem 3. Suppose $a > 0$. Then, for all sufficiently large positive integers
$n$, conditions (1)-(6) below hold.

(1) For any $x = \langle x_1, \ldots, x_n \rangle \in M_b^n$ there is a unique minimal geodesic
joining $x_i$ to $x_{i+1}$ ($1 \le i \le n - 1$), a unique minimal geodesic joining $p$ to $x_1$,
and a unique minimal geodesic joining $x_n$ to $q$.

Let $\beta(x)$ denote the curve (parametrized proportionally to arc length)
obtained by first following the minimal geodesic from $p$ to $x_1$, then (in succession)
each of the minimal geodesics from $x_i$ to $x_{i+1}$, and finally the minimal
geodesic from $x_n$ to $q$.
Define $\alpha: \Omega^a M \to M^n$ by the formula
\[ \alpha(u) = \Bigl\langle u\Bigl(\frac{1}{n+1}\Bigr), \ldots, u\Bigl(\frac{n}{n+1}\Bigr) \Bigr\rangle \]

(2) The mappings $\alpha: \Omega^a M \to M_b^n$ and $\beta: M_b^n \to \Omega^a M$ are inverse to one
another in the category Hot, so we have a homotopy equivalence
\[ \Omega^a M \simeq M_b^n \]

(3) $\phi^n$ is $C^\infty$ on $M_b^n$.

(4) $\beta$ maps the set of critical points of $\phi^n$ on $M_b^n$ one-to-one onto the set
of geodesics from $p$ to $q$ of length $\le a$.

(5) A critical point $x \in M_b^n$ is nondegenerate if and only if $p$ and $q$ are not
conjugate on the geodesic $\beta(x)$.

(6) For a nondegenerate critical point $x \in M_b^n$, the index is precisely the
number (counted with multiplicity) of conjugate points of $p$ on the geodesic
$\beta(x)$. (The notion of a conjugate point and its multiplicity will be defined
below.)


Example 


Let $M$ be the $n$-sphere $S^n$ with its usual Riemannian metric. (This is
the restriction of the usual Riemannian metric on Euclidean $(n + 1)$-space,
$E^{n+1}$, if $S^n$ is considered as the unit sphere in $E^{n+1}$.) Let $p$ be the north pole




and q any point except the south pole. Then q is not conjugate to p. Arcs 
of great circles are geodesics, and a sufficiently small segment of any geodesic 
is an arc of a great circle, but, of course, a geodesic may wind several times 
around the great circle on which it lies. For each nonnegative integer $k$ there
is exactly one geodesic on $S^n$ connecting $p$ and $q$ and passing through the
poles $k$ times. Each pole is a conjugate point of the north pole each time the
geodesic passes through it, and its multiplicity (as a conjugate point) is $n - 1$.

A positive number $a$ is regular if and only if it is not the length of one of
these geodesics. The geodesic connecting $p$ and $q$ that passes through the poles
$k$ times has length $\ell_k = d(p, q) + \pi k$ if $k$ is even and $\ell_k = \pi(k + 1) - d(p, q)$
if $k$ is odd, so that $\ell_0 < \ell_1 < \ell_2 < \cdots$ when $q \ne p$. Thus if we take
\[ \ell_k < a < \ell_{k+1} \]
we obtain the following from Theorem 3.

(a) $\Omega^a M \simeq M_b^n$ ($n$ large).

(b) There is one critical point of $\phi^n$ on $M_b^n$ for each geodesic from $p$
to $q$ of length $< a$, and these critical points are nondegenerate.

(c) The critical point of $\phi^n$ corresponding to the geodesic of length
$\ell_j$ (i.e., the geodesic passing through the poles $j$ times) is of index
$j(n - 1)$.

Thus $M_b^n$ is a manifold with boundary, provided with a function $\phi^n$
with only nondegenerate critical points, of index $0, (n - 1), 2(n - 1), \ldots,
k(n - 1)$. The function $\phi^n$ takes its absolute maximum $b$ on the boundary
and is nondegenerate there. It follows from the Morse theory that
\[ \Omega^a M \simeq M_b^n \simeq e_0 \cup_\alpha e_{n-1} \cup_\beta e_{2(n-1)} \cup \cdots \cup e_{k(n-1)} \]
Since we may apply this with $a$ arbitrarily large, we get
\[ \Omega S^n \simeq e_0 \cup e_{n-1} \cup e_{2(n-1)} \cup \cdots \cup e_{k(n-1)} \cup \cdots \]


Now we begin the proof of Theorem 3. Since $M$ is compact, there
exists a $\delta > 0$ such that there is a unique minimal geodesic $\gamma$ joining any two
points $x$, $y$ such that $d(x, y) < \delta$ and such that $\gamma$ depends smoothly on $x$ and
$y$. In particular, $d(x, y)^2$ depends smoothly on $x$ and $y$ for $d(x, y) < \delta$.
Choose $n$ so that $a^2/(n + 1) < \delta^2$. Then, for $x \in M_b^n$,
\[ d(p, x_1),\ d(x_i, x_{i+1}),\ d(x_n, q) < \delta \]
so that conditions (1) and (3) hold.

In stating (2), we assumed that the image of $\alpha$ lies in $M_b^n$ and that the
image of $\beta$ lies in $\Omega^a M$. Let us verify this.

Let $u \in \Omega^a M$, so that $\alpha(u) = \langle x_1, \ldots, x_n \rangle$ where $x_i = u(i/(n + 1))$.
Then each of the numbers
\[ d(p, x_1),\ d(x_i, x_{i+1}),\ d(x_n, q) \]
is $\le a/(n + 1)$, since $u$ is parametrized proportionally to arc length. Hence
\[ \phi^n(x) \le (n + 1)\Bigl(\frac{a}{n + 1}\Bigr)^2 = \frac{a^2}{n + 1} = b \]
On the other hand, suppose $x \in M_b^n$. The Schwarz inequality, applied to
the $(n + 1)$-tuples $\langle 1, \ldots, 1 \rangle$ and $\langle d(p, x_1), d(x_1, x_2), \ldots, d(x_n, q) \rangle$, yields
\begin{align*}
J(\beta(x)) &= d(p, x_1) + d(x_1, x_2) + \cdots + d(x_n, q) \\
&\le \bigl((n + 1)(d(p, x_1)^2 + d(x_1, x_2)^2 + \cdots + d(x_n, q)^2)\bigr)^{1/2} \\
&= (n + 1)^{1/2}\, \phi^n(x)^{1/2} \le a
\end{align*}
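The Schwarz step can be tried out on a concrete broken path. The sketch below assumes the Euclidean plane in place of $M$, so that minimal geodesics are straight segments and $d$ is the ordinary distance; the function names `phi` and `broken_length` and the sample points are illustrative only.

```python
import math

def chain(points, p, q):
    """The sequence p, x_1, ..., x_n, q."""
    return [p] + list(points) + [q]

def phi(points, p, q):
    """Sum of squared distances between consecutive points of the chain."""
    c = chain(points, p, q)
    return sum(math.dist(a, b) ** 2 for a, b in zip(c, c[1:]))

def broken_length(points, p, q):
    """Length of the broken geodesic through the chain (straight segments here)."""
    c = chain(points, p, q)
    return sum(math.dist(a, b) for a, b in zip(c, c[1:]))

p, q = (0.0, 0.0), (3.0, 0.0)
xs = [(1.0, 1.0), (2.0, -1.0)]     # n = 2 intermediate points
n = len(xs)

length = broken_length(xs, p, q)               # J(beta(x))
bound = math.sqrt((n + 1) * phi(xs, p, q))     # ((n + 1) phi(x))^(1/2)
```

Here `length` stays below `bound`, and the two coincide exactly when all $n + 1$ segments have the same length, for example for the evenly spaced points $(1, 0)$ and $(2, 0)$.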


To show that $\alpha$ and $\beta$ are inverse to one another in the category Hot, we
will construct homotopies $g_t$ connecting $\alpha\beta$ with 1 and $h_t$ connecting $\beta\alpha$ with 1.
First, we define $g_t$. Let $x \in M_b^n$ and set $u = \beta(x)$. We may choose numbers
$s_1, s_2, \ldots, s_n$ in a canonical way such that $u(s_i) = x_i$. Precisely, we let $s_1 =
d(p, x_1)/J(u)$ and $s_i - s_{i-1} = d(x_{i-1}, x_i)/J(u)$. The fact that $u$ is parametrized
proportionally to arc length then guarantees that $u(s_i) = x_i$. We define $g_t(x)
= \langle y_{1,t}, \ldots, y_{n,t} \rangle$, where
\[ y_{i,t} = u\Bigl(t s_i + \frac{(1 - t)\, i}{n + 1}\Bigr) \]

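Clearly $g_0 = \alpha\beta$ and $g_1 = 1$. Continuity is easily verified. To see that
$g_t(x) \in M_b^n$, we proceed as follows. Since $u$ is parametrized proportionally to arc length,
\[ d(y_{i,t}, y_{i+1,t}) \le (Ju)\Bigl(t(s_{i+1} - s_i) + \frac{1 - t}{n + 1}\Bigr) \]
for $0 \le i \le n$, where we set $y_{0,t} = p$, $y_{n+1,t} = q$, $s_0 = 0$, and $s_{n+1} = 1$. The
Schwarz inequality (in the form $(ta + (1 - t)b)^2 \le ta^2 + (1 - t)b^2$) yields
\begin{align*}
\phi^n(g_t(x)) &= \sum_{i=0}^{n} d(y_{i,t}, y_{i+1,t})^2 \\
&\le t \sum_{i=0}^{n} (Ju)^2 (s_{i+1} - s_i)^2 + (1 - t)\frac{(Ju)^2}{n + 1} \\
&= t\,\phi^n(x) + (1 - t)\frac{(Ju)^2}{n + 1}
\end{align*}
Since $\phi^n(x) \le b$ and $(Ju)^2/(n + 1) \le a^2/(n + 1) = b$, this shows $\phi^n(g_t(x)) \le b$,
as required.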

Now, we define $h_t$. Let $u \in \Omega^a M$. On the interval $[i/(n + 1), (i + t)/(n + 1)]$
(where $0 \le i \le n$), $h_t(u)$ is the minimal geodesic connecting $u(i/(n + 1))$
and $u((i + t)/(n + 1))$, and parametrized by arc length. On the interval
$[(i + t)/(n + 1), (i + 1)/(n + 1)]$, $h_t(u)$ is $u$. The required properties are
easily checked.

Next, we prove (4). For each $r > 0$ and each $x \in M$, we let $S_r(x)$ denote
the set of points of distance $r$ from $x$ and $B_r(x)$ the set of points of distance
$< r$ from $x$. We need the following elementary fact from Riemannian geometry.
For $\delta > 0$ sufficiently small and all $x \in M$, $S_\delta(x)$ is a sphere differentiably
embedded in $M$, and each geodesic emanating from $x$ is perpendicular
to $S_\delta(x)$ at the first point of intersection. We suppose that the $\delta$ that we just
chose satisfies this property.

Suppose $x = \langle x_1, \ldots, x_n \rangle \in M_n^b$. First, we show that if $\beta(x)$ is not a
geodesic, then $x$ is not a critical point of $\phi_n$; that is, $\beta$ maps the set of critical
points of $\phi_n$ on $M_n^b$ into the set of geodesics. Since $\beta(x)$ is parametrized
proportionally to arc length and each segment of $\beta(x)$ is a geodesic, the only
way for $\beta(x)$ to fail to be a geodesic is for the incoming tangent at some $x_i$
to be different from the outgoing tangent, as in Fig. 9. Assume that this is
the case.


FIGURE 9


In Fig. 9, we let $S_{-1}$ denote the sphere of radius $d(x_{i-1}, x_i)$ centered
at $x_{i-1}$ and $S_1$ the sphere of radius $d(x_i, x_{i+1})$ centered at $x_{i+1}$. Then $S_1$ and
$S_{-1}$ have different tangent planes at $x_i$, since the tangent plane to $S_1$ (resp. $S_{-1}$)
is orthogonal to the geodesic connecting $x_i$ and $x_{i+1}$ (resp. $x_{i-1}$ and $x_i$).
Then we can find a curve $y(t)$ in $S_1$ such that $y(0) = x_i$ and $\partial y(t)/\partial t \notin S_{-1}$
(that is, the velocity of $y$ is not tangent to $S_{-1}$). Set

$$x'(t) = \langle x_1, \ldots, x_{i-1}, y(t), x_{i+1}, \ldots, x_n \rangle$$

We claim that $\partial\phi_n(x'(t))/\partial t \ne 0$, which shows that $x = x'(0)$ is not a critical
point. In fact, all terms in the sum defining $\phi_n(x'(t))$ are constant except the
term $d(x_{i-1}, y(t))^2$. Furthermore, if we let $z(t)$ denote the first intersection
of the geodesic $x_{i-1}\,y(t)$ with $S_{-1}$, then

$$\begin{aligned}
d(x_{i-1}, y(t)) &= d(x_{i-1}, z(t)) \pm d(z(t), y(t)) \\
&= d(x_{i-1}, x_i) \pm d(z(t), y(t))
\end{aligned}$$

Since $\partial y(t)/\partial t \notin S_{-1}$,

$$\frac{\partial\bigl(d(z(t), y(t))\bigr)}{\partial t} \ne 0.$$

From these two equations, $\partial\phi_n(x'(t))/\partial t \ne 0$ follows easily. This completes
the proof that $\beta$ maps the set of critical points of $\phi_n$ on $M_n^b$ into the set of
geodesics.

To show that $\beta$ maps the set of critical points of $\phi_n$ on $M_n^b$ onto the set
of geodesics in $\Omega^a M$, it is enough to show that $\beta\alpha(\gamma) = \gamma$ and that $\alpha(\gamma)$ is a
critical point for any geodesic $\gamma$ of length $< a$. The first of these assertions is
obvious. We prove the second by contradiction. If $\alpha(\gamma)$ is not a critical
point, then there is a curve $t \mapsto y_t\colon I \to M_n^b$ such that $y_0 = \alpha(\gamma)$ and
$d\phi_n(y_t)/dt \ne 0$. But this is easily seen to imply $dJ(\beta(y_t))/dt \ne 0$. But this contradicts
the assumption that $\gamma = \beta\alpha(\gamma) = \beta(y_0)$ is a geodesic.

Finally, to show that $\beta$ is one-to-one on the set of critical points of
$\phi_n$ on $M_n^b$, it is enough to show that $\alpha\beta(x) = x$ for any such $x$. In other
words, we must show that if $x$ is a critical point of $\phi_n$ on $M_n^b$, then
$d(p, x_1) = d(x_1, x_2) = \cdots = d(x_n, q)$. Suppose otherwise. For example,
suppose $d(x_{i-1}, x_i) > d(x_i, x_{i+1})$. Let $t \mapsto x_i(t)$ denote the minimal geodesic
from $x_i$ to $x_{i-1}$ (so that $x_i(0) = x_i$ and $x_i(1) = x_{i-1}$). Let
$x(t) = \langle x_1, \ldots, x_i(t), \ldots, x_n \rangle$. It is easily seen that

$$\left.\frac{d}{dt}\,\phi_n(x(t))\right|_{t=0} \ne 0$$

which contradicts the hypothesis that $x = x(0)$ is a critical point of $\phi_n$. The
other cases are treated similarly.

This completes the proof of condition (4).

In order to say what conditions (5) and (6) mean, we must define the
notion of conjugate point. We let the exponential mapping $\exp\colon TM_p \to M$
be defined as follows. For any $X \in TM_p$, let $\exp X = \gamma(1)$, where $\gamma$ is the
unique geodesic with $\gamma(0) = p$ and $\gamma'(0) = X$. (Here $\gamma'(0)$ denotes the velocity
vector of $\gamma$ at $t = 0$.) From the hypothesis that $M$ is provided with a smooth
Riemannian metric, it follows that $\gamma$ is a smooth mapping. Let $\gamma$ be a geodesic
connecting $p$ and $q$. Assume $\gamma$ is parametrized so that $\gamma(0) = p$ and $\gamma(1) = q$.
We say $p$ and $q$ are conjugate along $\gamma$ if $\gamma'(0)$ is a critical point of the mapping
$\exp$, that is, the rank of $d(\exp)$ at $\gamma'(0)$ is $< m$, where $m = \dim M$. We say $p$ and $q$
have multiplicity $k$ along $\gamma$ if the rank of $d(\exp)$ at $\gamma'(0)$ is $m - k$. If $p$ and $q$ are conjugate
along any geodesic, then $q$ is a critical value of the mapping $\exp$, so by Sard's
theorem, there exists $q \in M$ such that $p$ and $q$ are not conjugate. It is easily
seen that the relation of being conjugate is symmetric in the following sense.
If $p$ and $q$ are conjugate along $\gamma$, then $q$ and $p$ are conjugate along $\bar\gamma$, where
$\bar\gamma(t) = \gamma(1 - t)$.

For example, consider the $n$-sphere $S^n$. The conjugate points to the
north pole are itself and the south pole, and these are conjugate along any
geodesic (except the trivial geodesic connecting the north pole to itself) and
have multiplicity $n - 1$. (In general, however, a point is not conjugate to
itself.)

We will not prove (5) and (6). For details, see, for example, R. Bott,
The Stable Homotopy of the Classical Groups (Ann. of Math. 70, 1959),
pp. 313-337.


3 HOMOTOPY GROUPS 


Let $X$ be a topological space with base point $*$. Let $*$ also denote a
fixed "base" point in the $k$-sphere $S^k$.


Definition. $\pi_k(X, *)$ = homotopy classes of maps of $S^k$ into $X$ that take $*$ in
$S^k$ to $*$ in $X$.

There is a group structure on $\pi_k(X, *)$ induced by a mapping
$\pi\colon S^k \to S^k \vee S^k$. Here $S^k \vee S^k$ is formed by taking the disjoint union of
$S^k$ and $S^k$ and identifying the two base points. A rough picture of the mapping
$\pi$ is given in Fig. 10, which is supposed to indicate how the outside
sphere is stretched over the wedge of the two inner spheres. In words, this
prescription of $\pi$ could be given as follows. Consider $S^k$ as the set of points
in $E^{k+1}$ such that $x_1^2 + \cdots + x_{k+1}^2 = 1$. Map the western hemisphere
$W = \{x_1 \le 0\}$ into the first copy of $S^k$ in $S^k \vee S^k$ by a mapping that takes
$S^{k-1} = \{x_1 = 0\}$ into the base point and that takes $W - S^{k-1}$ homeomorphically
in an orientation-preserving way onto $S^k - *$. Map the eastern
hemisphere $E = \{x_1 \ge 0\}$ into the second copy similarly.


FIGURE 10


Now the group law is defined as follows. Let $[\alpha], [\beta] \in \pi_k(X)$ be
homotopy classes with representatives $\alpha, \beta\colon (S^k, *) \to (X, *)$. (This notation
means that $\alpha$ and $\beta$ are mappings $S^k \to X$ that take $*$ to $*$.) Then $\alpha \vee \beta$
maps $S^k \vee S^k$ into $X$ so $(\alpha \vee \beta) \circ \pi$ maps $S^k$ into $X$. Also $(\alpha \vee \beta) \circ \pi(*) = *$.
Thus $(\alpha \vee \beta) \circ \pi$ defines a member of $\pi_k(X)$, which we define to be $[\alpha][\beta]$.
In symbols

$$[\alpha][\beta] = [(\alpha \vee \beta) \circ \pi]$$

To show that this is a real definition, one must show that this is independent
of the choice of representative. Also, one must verify that this defines
a group law. One would like to be able to give explicit formulas for the
homotopies, etc., that are involved. Using the definition we have just given,
this would be difficult. Therefore, we introduce another definition, which is
technically easier to use, though perhaps not so geometrically appealing.

Let $I^n$ be the cube $0 \le x_i \le 1$ in Euclidean $n$-space $E^n$. Consider maps
$\mu\colon (I^n, \dot{I}^n) \to (X, *)$. (This means $\mu$ is a map of $I^n$ into $X$, which maps the
boundary $\dot{I}^n$ of $I^n$ into $*$.) We let $\pi_n(X, *)$ denote homotopy classes of such
maps. Since $I^n/\dot{I}^n$ (i.e., $I^n$ with $\dot{I}^n$ identified to a point) is $S^n$, this may be identified
with homotopy classes of maps $(S^n, *) \to (X, *)$.

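We define the group multiplication as in Fig. 11. The precise prescription
is as follows. Let $[\alpha]$ and $[\beta]$ be elements of $\pi_n(X)$ with representatives

$$\alpha, \beta\colon (I^n, \dot{I}^n) \to (X, *)$$


FIGURE 11


Now let $\gamma\colon (I^n, \dot{I}^n) \to (X, *)$ be given by

$$\gamma(x_1, x_2, \ldots, x_n) = \begin{cases} \alpha(2x_1, x_2, \ldots, x_n), & 0 \le x_1 \le \tfrac{1}{2} \\ \beta(2x_1 - 1, x_2, \ldots, x_n), & \tfrac{1}{2} \le x_1 \le 1 \end{cases}$$

We define $[\alpha][\beta]$ as the homotopy class of $\gamma$.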
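The cube prescription for the product translates directly into code. The sketch below is only an illustration under simple assumptions: maps of the cube are modeled as ordinary Python functions of the coordinates, and the two sample maps are hypothetical stand-ins that merely record where they were evaluated.

```python
def concat(alpha, beta):
    """Return gamma with gamma(x1, ...) = alpha(2*x1, ...) for x1 <= 1/2 and
    gamma(x1, ...) = beta(2*x1 - 1, ...) for x1 >= 1/2."""
    def gamma(x1, *rest):
        if x1 <= 0.5:
            return alpha(2 * x1, *rest)
        return beta(2 * x1 - 1, *rest)
    return gamma

# Hypothetical stand-ins for maps of the 2-cube: each records which map was
# used and at which rescaled point it was evaluated.
alpha = lambda x1, x2: ("alpha", x1, x2)
beta = lambda x1, x2: ("beta", x1, x2)
gamma = concat(alpha, beta)

print(gamma(0.25, 0.3))   # evaluated through alpha at x1 = 0.5
print(gamma(0.75, 0.3))   # evaluated through beta at x1 = 0.5
```

For genuine representatives the two branches agree at $x_1 = 1/2$, because both $\alpha$ and $\beta$ send the boundary of the cube to the base point.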

Now it is a straightforward matter to check that $[\alpha][\beta]$ is independent
of the choice of representatives $\alpha$ and $\beta$ and that $\pi_k(X, *)$ is a group with this
composition law.

Furthermore, $\pi_k(X, *)$ is an Abelian group if $k \ge 2$, that is, $[\alpha][\beta] = [\beta][\alpha]$.

If $f\colon (X, *) \to (Y, *)$ is a continuous mapping, then there is induced a
mapping $f_*\colon \pi_k(X) \to \pi_k(Y)$, defined by $f_*([\alpha]) = [f \circ \alpha]$. This mapping is a
group homomorphism. Further, $(g \circ f)_* = g_* \circ f_*$ and $1_* = 1$. In terms of
categories, this means:


Proposition 1. $\pi_k$ is a functor from the category of pointed spaces to the category
of groups.


Definition. A covariant functor from category A to category B is a function $F$
that assigns to each object $a$ in A an object $F(a)$ in B and assigns to every
"map" $f\colon a \to a'$, a "map" $F(f)\colon F(a) \to F(a')$ such that


(1) $F(f \circ g) = F(f) \circ F(g)$
(2) $F(1_a) = 1_{F(a)}$


A contravariant functor from a category A to a category B is a function $F$
that assigns to each object $a$ in A an object $F(a)$ in B and assigns to every
"map" $f\colon a \to a'$, a "map" $F(f)\colon F(a') \to F(a)$ such that


(1) $F(f \circ g) = F(g) \circ F(f)$
(2) $F(1_a) = 1_{F(a)}$


In Proposition 1, "the category of pointed spaces" means the category
whose objects are spaces with base points and whose "mappings" are the
continuous mappings that preserve base points. "The category of groups"
means the category whose objects are groups and whose "mappings" are
group homomorphisms.

Let C be any category and $A$ an object of C. Then we have functors
$F$ and $G$ from C to the category of sets, defined by

$$F(X) = \mathrm{Hom}(X, A)$$

$$G(X) = \mathrm{Hom}(A, X)$$

The former is contravariant and the latter is covariant. A functor is said to
be representable if it is of one of these two forms. For example, $\pi_k$ is representable
by definition.
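The variance of the two Hom functors can be seen in a toy model. The sketch below assumes that "maps" are ordinary Python functions and that composition is function composition; the names `F`, `G`, and `compose` are illustrative choices, not notation from the text beyond the letters $F$ and $G$.

```python
def compose(f, g):
    """The composite f o g, first g and then f."""
    return lambda x: f(g(x))

def G(f):
    """Covariant action of Hom(A, -): a map f: X -> Y sends u in Hom(A, X)
    to f o u in Hom(A, Y)."""
    return lambda u: compose(f, u)

def F(f):
    """Contravariant action of Hom(-, A): a map f: X -> Y sends u in Hom(Y, A)
    to u o f in Hom(X, A)."""
    return lambda u: compose(u, f)

g = lambda n: n + 1
f = lambda n: 2 * n
u = lambda n: n * n

# G preserves the order of composition, F reverses it.
assert G(compose(f, g))(u)(3) == G(f)(G(g)(u))(3)
assert F(compose(f, g))(u)(3) == F(g)(F(f)(u))(3)
```

The two assertions are instances of rules (1) in the covariant and contravariant definitions, checked at a single sample argument rather than proved.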

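Let $\Omega_* X$ denote the set of mappings $(I, \dot{I}) \to (X, *)$. We consider the
constant map as the base point of this space.


Proposition 2.

$$\lambda\colon \pi_k(X) \xrightarrow{\ \cong\ } \pi_{k-1}(\Omega_* X),$$

where $\lambda$ is defined by

$$(\lambda\mu)(x_1, \ldots, x_{k-1})(t) = \mu(t, x_1, \ldots, x_{k-1})$$

Thus for any mapping $\mu\colon (I^k, \dot{I}^k) \to X$, the map $\lambda\mu$ "reinterprets" $\mu$ as
a map of $(I^{k-1}, \dot{I}^{k-1})$ into $\Omega_* X$.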
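In programming terms the reinterpretation is a form of currying. A minimal sketch, assuming maps of the cube are modeled as Python functions whose first argument is the loop parameter $t$; the name `lam` and the sample map `mu` are hypothetical.

```python
def lam(mu):
    """Reinterpret mu(t, x_1, ..., x_{k-1}) as a family of loops: to each
    point (x_1, ..., x_{k-1}) assign the loop t -> mu(t, x_1, ..., x_{k-1})."""
    def family(*x):
        return lambda t: mu(t, *x)
    return family

mu = lambda t, x1, x2: (t, x1 + x2)   # hypothetical map on the 3-cube
loop = lam(mu)(0.25, 0.5)             # the loop attached to the point (0.25, 0.5)
assert loop(0.5) == mu(0.5, 0.25, 0.5)
```

Nothing is computed differently; only the grouping of the arguments changes, which is why the correspondence is a bijection on the level of maps.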


Proposition 3. Assume $p$ and $q$ are in the same path component of $X$; that is,
there exists a curve $v\colon I \to X$ joining $p$ and $q$. Such a curve induces an isomorphism
$v_\#\colon \pi_k(X, p) \to \pi_k(X, q)$. This isomorphism depends only on the homotopy
class of the path $v$ joining $p$ and $q$.

In particular, $\pi_1(X, *) = 0$ implies there is a canonical isomorphism
$\pi_k(X, p) \cong \pi_k(X, q)$.

We define $v_\#$ as follows. Let $I^k$ denote the standard cube (given
by $0 \le x_i \le 1$). Let $C$ denote the cube $\tfrac{1}{4} \le x_i \le \tfrac{3}{4}$. Let $e\colon C \to I^k$ be
the homeomorphism given by $e(x_1, \ldots, x_k) = \langle 2x_1 - \tfrac{1}{2}, \ldots, 2x_k - \tfrac{1}{2} \rangle$. Let
$[\alpha] \in \pi_k(X)$. Define $v_\#[\alpha] = [\beta]$, where $\beta\colon (I^k, \dot{I}^k) \to (X, *)$ is given by the
prescription


(1) $\beta \mid C = \alpha \circ e$
(2) If $y \in \dot{I}^k$, then

$$\beta(ty + (1 - t)e^{-1}(y)) = v(t)$$

Since each point in $I^k$ is in $C$ or on a unique line segment joining a point
$y \in \dot{I}^k$ with $e^{-1}y$, this defines $v_\#[\alpha]$.

As usual, one must check that this is independent of the choice of representative,
and that the other assertions in the proposition hold, but we omit
these details.


Remark. If $p = q$ and we are acting on $\pi_1$, then $v_\#\colon \pi_1(X, *) \to \pi_1(X, *)$ is
given by $\alpha \mapsto [v]\alpha[v]^{-1}$.


Definition. $\pi_0(X)$ = the set of arc components.
$\pi_0(X)$ is a pointed set (i.e., a set with a singled-out point).


Theorem 1. $\pi_k(S^n) = 0$, for $k < n$.

If the map of $S^k$ into $S^n$ representing a given homotopy class is fairly
nice, say a polynomial mapping, then the image does not fill the sphere.
Take a point $p$ not in the image and contract the complement of $p$ to a point.
This defines a homotopy connecting the given map and a constant map.
Unfortunately, there is a continuous map of $S^k$ onto $S^n$, so we must first
approximate the map by a nice map to make the argument work.

Consider $S^k \subset R^{k+1}$, $S^n \subset R^{n+1}$. Then $f\colon S^k \to S^n$ is given by continuous
functions $x_1, \ldots, x_{n+1}$ defined on $S^k$. Approximate these functions to
within $\varepsilon$ by $p_1, \ldots, p_{n+1}$, which are polynomials in the $y$'s (coordinates of
$R^{k+1}$). Dividing by norm $p = \sqrt{\sum |p_i|^2}$, we get a map $P\colon S^k \to S^n$ given by
rational functions. Furthermore, $P(y)$ is close to $f(y)$ for all $y \in S^k$, so there
is a unique geodesic joining $P(y)$ to $f(y)$. This geodesic may be used to
deform $P$ into $f$, so we have that $P$ is homotopic to $f$.

In particular, $f$ is homotopic to a $C^\infty$ map. The fact that $f[S^k] \ne S^n$
(always assuming $k < n$) is a consequence of the corollary of the following
result. Thus Theorem 1 follows.


Theorem. Let $f\colon M \to N$ be a $C^\infty$ map of $C^\infty$ manifolds. Let $\Sigma$ denote the
set of $x \in M$ for which

$$\operatorname{rank} df_x < \dim N$$

Then $f[\Sigma]$ has measure 0 in $N$.

This is usually referred to as Sard's theorem. For a proof, see Sternberg,
Lectures on Differential Geometry, Prentice-Hall, Englewood Cliffs,
N.J., 1964. In the case $\dim M < \dim N$, Sard's theorem is a quite easy consequence
of the mean value theorem. However, in the general situation, the
proof is difficult.


Corollary. If $\dim M < \dim N$, then $f[M]$ has measure 0 in $N$.
PROOF. In this case, $\Sigma = M$.


Proposition 4. Let $Y = X \cup e_n$. The inclusion $i\colon X \to Y$ induces an isomorphism
$\pi_k(X) \to \pi_k(Y)$ for $k < n - 1$, and $\pi_{n-1}(X) \to \pi_{n-1}(Y)$ is onto.

The proof uses the same ideas as the proof of Theorem 1. For example,
to show that $\pi_k(X) \to \pi_k(Y)$ is onto for $k < n$, we take a map of $S^k$ into $Y$ and
approximate it by a map that is differentiable on the inverse image of $e_n$.
Then by Sard's theorem the image of the map does not include all of $e_n$.
Thus we may remove some point from $e_n$ and push the remainder into $X$.
This provides us with a homotopy of the given map to a map whose image
lies in $X$. Thus $\pi_k(X) \to \pi_k(Y)$ is onto.

To show $\pi_k(X) \to \pi_k(Y)$ is an isomorphism, we must replace a homotopy
connecting two mappings of $S^k$ into $X$ by a homotopy that stays in $X$. Thus
we must replace a mapping $(S^k \times I, S^k \times \dot{I}) \to (Y, X)$ by a mapping of $S^k \times I$
into $X$ that coincides with the original mapping on $S^k \times \dot{I}$. This can be
done by the same argument, provided $k < n - 1$, because now we are mapping
a $(k + 1)$-dimensional object into our target space.


Corollary (Freudenthal Suspension Theorem). $\pi_k(S^n) \cong \pi_{k-1}(S^{n-1})$, provided
$k < 2(n - 1)$.
(Here $\cong$ means isomorphism of groups.)


PROOF. By Proposition 2, $\pi_k(S^n) \cong \pi_{k-1}(\Omega_* S^n)$. By the Morse theory,

$$\Omega S^n \sim S^{n-1} \cup e_{2(n-1)} \cup e_{3(n-1)} \cup \cdots$$

It is not difficult to show that $\pi_k(\Omega_* S^{n-1}) \cong \pi_k(\Omega S^{n-1})$, where $\Omega_* S^{n-1}$ denotes
the loop space of Proposition 2 and $\Omega S^{n-1}$ denotes the loop space that we
used in Morse theory. Thus Proposition 4 permits us to conclude the
corollary.

Now let us compute $\pi_k(S^k)$. From the Freudenthal suspension
theorem,

$$\pi_k(S^k) \cong \pi_{k-1}(S^{k-1}), \qquad k \ge 3$$

Thus, it suffices to compute $\pi_2(S^2)$ and $\pi_1(S^1)$.
There are no conjugate points on $S^1$, so the Morse theory implies

$$\Omega S^1 \sim \text{pt.} \cup \text{pt.} \cup \cdots$$

Thus $\Omega S^1$ has the homotopy type of a countable number of points. These
points are in one-to-one correspondence with the geodesics from $p$ to $q$.

In our Morse theoretical constructions, the only requirement on the
points $p$ and $q$ was that they not be conjugate. Thus, in this case we could
take $p = q$. Thus the geodesics are: the constant geodesic; the geodesics
that wrap around the circle $n$ times ($n > 0$) in the positive direction; and the
geodesics that wrap around the circle $n$ times ($n > 0$) in the negative direction.
This sets up a one-to-one correspondence between the integers and $\pi_1(S^1)$.
It is easy to check that this correspondence preserves products.

Now we want to compare $\pi_2(S^2)$ with $\pi_1(S^1)$. Note that

$$\Omega S^2 \sim S^1 \cup e_2 \cup \cdots$$

We could show that the attaching map $\alpha$ of the cell $e_2$ satisfies $\alpha \sim *$, which
implies $\pi_2(S^2) \cong \pi_1(S^1) = Z$. However, we
leave this as an exercise to the reader, and show $\pi_2(S^2) = Z$ in another way.


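The method that we use is based on the Hopf fibering, $\pi\colon S^3 \to S^2$.

This map $\pi$ is defined as follows. We have

$$S^3 = \{\langle z_1, z_2 \rangle \in C^2 : |z_1|^2 + |z_2|^2 = 1\}$$

Also $S^2 = CP_1$. That is, $S^2$ is the set of equivalence classes of elements
in $C^2 - \{0\}$, where $\langle z_1, z_2 \rangle \sim \langle \lambda z_1, \lambda z_2 \rangle$. Write the equivalence class of
$\langle z_1, z_2 \rangle$ as $\{z_1, z_2\}$. Then we define $\pi$ by $\pi(\langle z_1, z_2 \rangle) = \{z_1, z_2\}$. Clearly
$\pi^{-1}(\text{pt.})$ is $S^1$. In fact, each point in $S^2$ has a neighborhood $U$ such that
there is a commutative diagram

$$\begin{array}{ccc}
\pi^{-1}U & \xrightarrow{\ h\ } & U \times S^1 \\
\downarrow{\scriptstyle \pi} & & \downarrow{\scriptstyle \text{proj.}} \\
U & = & U
\end{array}$$

where $h$ is a homeomorphism. Thus $S^3$ is locally trivial over $S^2$.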
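The Hopf map can be made concrete once $CP_1$ is given coordinates. The sketch below assumes one common identification of $CP_1$ with the unit sphere in $R^3$, namely $\{z_1, z_2\} \mapsto (2\,\mathrm{Re}\, z_1\bar z_2,\ 2\,\mathrm{Im}\, z_1\bar z_2,\ |z_1|^2 - |z_2|^2)$; this formula is an assumption of the example, since the text defines the map only in terms of equivalence classes.

```python
import cmath
import math

def hopf(z1, z2):
    """Send a point <z1, z2> of S^3 to a point of the unit sphere in R^3.
    The coordinates identifying CP_1 with S^2 are an assumption of this sketch."""
    w = z1 * z2.conjugate()
    return (2 * w.real, 2 * w.imag, abs(z1) ** 2 - abs(z2) ** 2)

z1, z2 = complex(0.6, 0.0), complex(0.0, 0.8)   # |z1|^2 + |z2|^2 = 1
lam = cmath.exp(1j * 0.7)                        # a unit complex number
p = hopf(z1, z2)
q = hopf(lam * z1, lam * z2)

# The image lies on S^2, and the map is constant on the circle {<lam z1, lam z2>}.
assert math.isclose(sum(c * c for c in p), 1.0)
assert all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(p, q))
```

The second assertion reflects the statement that each fiber $\pi^{-1}(\text{pt.})$ is a circle: multiplying both coordinates by a unit complex number does not change the image.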

In $n$ dimensions, one can still construct a mapping $\pi\colon S^{2n-1} \to CP_{n-1}$,
taking $\langle z_1, \ldots, z_n \rangle \to \{z_1, \ldots, z_n\}$, where we consider $S^{2n-1}$ as the unit sphere
in $C^n$ and $CP_{n-1}$ as the quotient space $(C^n - \{0\})/C^*$. Again, this mapping is
locally trivial. However, it is not very useful in computing the homotopy
groups of spheres, since $CP_{n-1}$ is not a sphere, except when $n = 2$.

Since the Hopf mapping of $S^3$ onto $S^2$ is locally trivial and has fiber
$S^1$, we should expect some relation between the homotopy of $S^2$, $S^1$, and $S^3$.

First, let us consider the example of a globally trivial mapping, that is,
a product.


Proposition. $\pi_k(X \times Y) = \pi_k(X) \times \pi_k(Y)$.


PROOF. For any space $Z$, $\mathrm{Hom}(Z, X \times Y) = \mathrm{Hom}(Z, X) \times \mathrm{Hom}(Z, Y)$,
where Hom denotes the set of continuous mappings. From this, the proposition
is almost immediate.

From the fact that $\pi\colon S^3 \to S^2$ is locally trivial it follows that given a
curve $u\colon I \to S^2$ and a point $p \in S^3$ above $u$ (that is, in $\pi^{-1}(u(0))$), then there
is a curve $v\colon I \to S^3$ covering $u$, that is, such that $\pi \circ v = u$, and starting at $p$,
that is, such that $v(0) = p$.


Definition. Consider a map $\pi\colon E \to B$ and a space $Z$. Then $\pi$ has the covering
homotopy property, with respect to $Z$, if given a commutative diagram (as
indicated by the solid arrows)

$$\begin{array}{ccc}
Z \times \{0\} & \xrightarrow{\ f\ } & E \\
\downarrow & \nearrow{\scriptstyle F} & \downarrow{\scriptstyle \pi} \\
Z \times I & \xrightarrow{\ G\ } & B
\end{array}$$

we can find a mapping $F$ (as indicated by the broken arrow) that makes the
complete diagram commutative.

This means: we suppose we are given mappings $f\colon Z = Z \times \{0\} \to E$
and $G\colon Z \times I \to B$ such that $\pi \circ f = G \mid Z \times \{0\}$ (i.e., such that $G$ is a homotopy
of $\pi \circ f$). Then we can find a map $F\colon Z \times I \to E$ such that $F \mid Z \times \{0\} = f$
and $\pi \circ F = G$. The first of these last two conditions is expressed by saying
that $F$ is a homotopy of $f$. The second is expressed by saying $F$ covers $G$.

If this property holds for polyhedral $Z$, then we say $\pi$ has the polyhedral
covering homotopy property and call $\pi$ a "fibering in the sense of Serre."
It is easily shown that if $\pi\colon E \to B$ is locally trivial, then $\pi$ is a fibering in
this sense.

Given such a fibering, let $F = \pi^{-1}(*)$ and let $*$ be a "base point" in
$F$. Then we have an exact sequence of homotopy groups,

$$\cdots \to \pi_k(F, *) \xrightarrow{\ i_*\ } \pi_k(E, *) \xrightarrow{\ \pi_*\ } \pi_k(B, *) \xrightarrow{\ \partial\ } \pi_{k-1}(F, *) \xrightarrow{\ i_*\ } \pi_{k-1}(E) \to \cdots$$

(Here $i\colon F \to E$ denotes the inclusion.) More precisely, we can define a homomorphism
$\partial\colon \pi_k(B, *) \to \pi_{k-1}(F, *)$ that makes the resulting sequence exact.
A sequence

$$\cdots \to A \xrightarrow{\ u\ } B \xrightarrow{\ v\ } C \to \cdots$$

of group homomorphisms is said to be exact at $B$ if kernel $v$ = image $u$, that
is, $v(b) = 0$ if and only if there exists $a \in A$ such that $b = u(a)$. It is exact if
it is exact at every place where this definition makes sense.

The mapping $\partial\colon \pi_1(B, *) \to \pi_0(F, *)$ is defined as follows. Take a loop $u$
in the base $B$, that is, a homotopy of $*$ that ends at $*$. By the covering homotopy
property, one can find a homotopy $\tilde{u}$ of $*$ in $E$ covering this homotopy.
Look at the end point of this homotopy. Define $\partial([u])$ to be the arc component
of the end point of $\tilde{u}$.

In general, $\partial$ is defined in a similar fashion.


Example. Let $\pi\colon E \to B$ be a proper submersion of differentiable manifolds.
(A proper mapping $f$ is a mapping such that $f^{-1}K$ is compact for compact $K$
in the range. A submersion is a $C^1$ mapping $f\colon E \to B$ such that at each $x \in E$,
the differential $df_x\colon TE_x \to TB_{f(x)}$ is onto.) Then $\pi$ is locally trivial, and
therefore has the covering homotopy property.


Proposition. If there exists a section $s$ of $\pi\colon E \to B$ (that is, a map $s\colon B \to E$
such that $\pi \circ s = 1$), then $\partial = 0$ in the homotopy exact sequence.


PROOF. $\pi \circ s = 1$ implies $\pi_* \circ s_* = 1$. Hence $\pi_*$ is onto. By exactness of
the homotopy sequence, $\partial = 0$.
Thus, we get an exact sequence

$$0 \to \pi_k(F, *) \to \pi_k(E, *) \to \pi_k(B, *) \to 0$$


This is an example of a short exact sequence. Any exact sequence

$$0 \to A \xrightarrow{\ \alpha\ } B \xrightarrow{\ \beta\ } C \to 0$$

having five terms and zeros on both ends is called a short exact sequence.
Note that exactness at $A$ means that $\alpha$ is injective (one-to-one) and exactness
at $C$ means $\beta$ is surjective. For any groups $A$ and $C$, there is a short exact
sequence

$$0 \to A \to A \times C \to C \to 0$$

An example of a short exact sequence not of this form is

$$0 \to Z_2 \to Z_4 \to Z_2 \to 0$$
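For finite groups, exactness can be verified by brute force. The sketch below assumes the sequence in question is $0 \to Z_2 \to Z_4 \to Z_2 \to 0$ with the maps $a \mapsto 2a$ and $b \mapsto b \bmod 2$; these explicit maps, and the function names, are assumptions of the example.

```python
Z2 = range(2)
Z4 = range(4)

def inc(a):
    """alpha: Z_2 -> Z_4, a |-> 2a."""
    return (2 * a) % 4

def red(b):
    """beta: Z_4 -> Z_2, b |-> b mod 2."""
    return b % 2

image_inc = {inc(a) for a in Z2}
kernel_red = {b for b in Z4 if red(b) == 0}

assert len(image_inc) == len(Z2)            # alpha is injective
assert image_inc == kernel_red              # exact at the middle term
assert {red(b) for b in Z4} == set(Z2)      # beta is surjective

def order(b, n):
    """Order of b in the cyclic group Z_n."""
    k, x = 1, b % n
    while x != 0:
        x = (x + b) % n
        k += 1
    return k

# Z_4 has an element of order 4, whereas every element of Z_2 x Z_2 has
# order at most 2, so the middle term is not the product of the outer terms.
assert max(order(b, 4) for b in Z4) == 4
```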


Proposition. If $i\colon F \to E$ is homotopic to a constant map, then we get an exact
sequence

$$0 \to \pi_k(E) \to \pi_k(B) \to \pi_{k-1}(F) \to 0$$

PROOF. Since $i$ is homotopic to a trivial map, $i_* = 0$.


Example. In the case of the Hopf fibration $\pi\colon S^3 \to S^2$, the inclusion $i\colon S^1 \to S^3$
is homotopic to a constant map. Thus we have a short exact sequence

$$0 \to \pi_2(S^3) \to \pi_2(S^2) \to \pi_1(S^1) \to 0$$

But $\pi_2(S^3) = 0$, so $\partial\colon \pi_2(S^2) \to \pi_1(S^1)$ is an isomorphism. We know $\pi_1(S^1) = Z$,
so we now know that $\pi_2(S^2) = Z$. By the Freudenthal suspension theorem
$\pi_k(S^k) = Z$, $k \ge 1$.

Now that we know that $\pi_n(S^n) = Z$, it is of interest to ask whether
there is an effective way of telling which homotopy class a given mapping
lies in. We now construct an isomorphism $\pi_n(S^n) \to Z$ that provides a fairly
effective way. We begin with a differentiable map $f\colon S^n \to S^n$. By Sard's
theorem, the critical values constitute a set of measure 0. (The set of critical
values is defined as $f[\Sigma]$, where $\Sigma$ was the set occurring in the statement of
Sard's theorem.) Choose a noncritical value $p \in S^n$. Then $f^{-1}(p)$ will be a
finite set of points $q_1, \ldots, q_r$. Choose an orientation on the $n$-sphere. Then
it makes sense to say whether $df_{q_i}\colon TS^n_{q_i} \to TS^n_p$ preserves or reverses orientation,
namely, we say $df_{q_i}$ preserves orientation if it has determinant $> 0$ and
that it reverses orientation if it has determinant $< 0$. Now we define

$$d(f) = \sum_{q \in f^{-1}(p)} \varepsilon(q),$$

where $\varepsilon(q) = +1$ if $f$ preserves orientation at $q$ and $\varepsilon(q) = -1$ if $f$ reverses
orientation at $q$.
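For $n = 1$ the signed count can be carried out numerically. The sketch below makes several assumptions that are not part of the text: a map $S^1 \to S^1$ is described by a continuous lift $\theta \mapsto F(\theta)$ of its angle, the point $p$ is given as an angle and is a regular value, and preimages are detected as sign-carrying crossings on a fine grid.

```python
import math

def degree(lift, p, samples=10000):
    """Signed count of solutions of lift(theta) = p (mod 2*pi) on [0, 2*pi].
    A crossing in the increasing direction counts +1, a decreasing one -1."""
    two_pi = 2 * math.pi
    total = 0
    prev = lift(0.0)
    for j in range(1, samples + 1):
        cur = lift(two_pi * j / samples)
        lo, hi = min(prev, cur), max(prev, cur)
        # number of levels p + 2*pi*m lying in the half-open interval (lo, hi]
        m_lo = math.floor((lo - p) / two_pi) + 1
        m_hi = math.floor((hi - p) / two_pi)
        crossings = max(0, m_hi - m_lo + 1)
        total += crossings if cur > prev else -crossings
        prev = cur
    return total

print(degree(lambda t: 3 * t, 1.0))                      # wraps three times
print(degree(lambda t: -2 * t, 1.0))                     # wraps twice, reversed
print(degree(lambda t: t + 2 * math.sin(3 * t), 1.0))    # extra preimages cancel
```

The third map has several preimages of $p$, but those at which the map reverses orientation cancel against some of those at which it preserves orientation, which illustrates why the sum is insensitive to deformations.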


Theorem. $d(f)$ depends only on the homotopy class of $f$, and $d$ induces an
isomorphism $\pi_n(S^n) \to Z$.

The proof of this theorem depends on the Thom transversality lemma.
Consider the following diagram of differentiable manifolds and mappings

$$\begin{array}{ccc}
N & \xrightarrow{\ f\ } & M \\
 & & \cup \\
 & & V
\end{array}$$


Definition. $f$ is transversal to $V$ if at every point $p \in V$ and every $q \in f^{-1}(p)$, the
composition

$$TN_q \xrightarrow{\ df\ } TM_p \xrightarrow{\ \text{proj.}\ } \frac{TM_p}{TV_p}$$

is onto.

If $f\colon N \to M$ is transversal to $V$, then $f^{-1}V$ is a submanifold of $N$ of
dimension $= \dim N - \dim M + \dim V$. In other words, $\operatorname{codim} f^{-1}V = \operatorname{codim} V$.
(The codimension of a submanifold is the dimension of the ambient
manifold minus the dimension of the submanifold.)


Thom Transversality Lemma. Suppose $M$ is compact. If $f\colon M \to N$ is $C^r$, then
there exists a map $g\colon M \to N$ transversal to $V$ such that $g$ together with its
derivatives of order $\le r$ approximates $f$ together with its derivatives of
order $\le r$ arbitrarily closely.

Furthermore, one can show that if $K$ is a closed subset of $M$ and $f$ is
transversal to $V$ on $K$, then one can choose $g$ so that $g \mid K = f \mid K$.

Now we sketch the proof that $d(f)$ is independent of the homotopy
class of $f$ and of the choice of $p$. First suppose $f'$ is homotopic to $f$ and $p$
is not a critical value of $f'$. Let $F$ be a smooth homotopy connecting $f$ and $f'$.
(Thus $F$ maps $S^n \times I$ into $S^n$; $F_0 = f$ and $F_1 = f'$.) By the Thom transversality
theorem we can assume that $F$ is transversal to $p$. Then $F^{-1}(p)$ is a
one-manifold whose boundary lies in $S^n \times \dot{I}$, as in Fig. 12. Then each point
$q$ in $f^{-1}(p)$ is connected by an arc in $F^{-1}(p)$ to either a point in $f^{-1}(p) \subset
S^n \times \{0\}$ or a point in $f'^{-1}(p) \subset S^n \times \{1\}$. It is easily seen that if $q$ and $q'$
in $f^{-1}(p)$ are connected by an arc in $F^{-1}(p)$ then $f$ preserves orientation at
one of the two points and reverses it at the other. This is also true with $f'$ in
place of $f$. If $q \in f^{-1}(p)$ and $q' \in f'^{-1}(p)$ are connected by an arc in $F^{-1}(p)$,
then $f$ preserves or reverses orientation at $q$ according to whether $f'$ preserves
or reverses orientation at $q'$. From these facts, it follows that $d(f) = d(f')$
(where $d(f)$ and $d(f')$ are defined using the same $p$).


FIGURE 12


Next, consider a single map $f$ and two points $p$ and $p'$, neither in the
critical set of $f$. Let $l$ be a smooth curve connecting $p$ and $p'$. By changing
$f$ (but keeping it in the same homotopy class), we may suppose that $f$ is
transversal to $l$. Then $f^{-1}(l)$ is a one-manifold whose boundary is $f^{-1}(p) \cup
f^{-1}(p')$. Again, one has to check that a curve joining a point in $f^{-1}(p)$
with one in $f^{-1}(p')$ preserves orientation and that others reverse orientation.
Then the independence of $d(f)$ from the choice of $p$ follows.

By the same methods, one can show that the map d is a homomorphism. 
In fact, with a little more work, using the same ideas, one can show that the 
map is an isomorphism. 


Example. Brouwer Fixed Point Theorem. Given a continuous map
$f\colon D^n \to D^n$, where $D^n = \{\langle x_1, \ldots, x_n \rangle : \sum x_i^2 \le 1\}$, there exists $x \in D^n$ such
that $f(x) = x$.

If $f(x) = x$, we say $x$ is a fixed point of $f$.


PROOF. Suppose, to the contrary, there exists $f$ with no fixed point. Let $h(x)$
denote the place where the ray from $f(x)$ through $x$ meets the boundary
$S^{n-1}$ of $D^n$ (Fig. 13). Then $h\colon D^n \to S^{n-1}$ is continuous and $h \circ i = 1$, where
$i\colon S^{n-1} \to D^n$ is the inclusion. Hence the composition

$$\pi_{n-1}(S^{n-1}) \xrightarrow{\ i_*\ } \pi_{n-1}(D^n) \xrightarrow{\ h_*\ } \pi_{n-1}(S^{n-1})$$

is the identity; that is, $h_* \circ i_* = 1$. But $\pi_{n-1}(S^{n-1}) = Z$ and $\pi_{n-1}(D^n) = 0$, so
this is clearly impossible. This contradiction proves the theorem.
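The map $h$ used in the proof has an explicit formula, obtained by solving a quadratic equation for the point where the ray leaves the disk. The following sketch assumes points are given as lists of coordinates and that $f(x) \ne x$ at the point in question; the function name and the sample values are hypothetical.

```python
import math

def retract(x, fx):
    """Point where the ray from fx through x (with fx != x) meets the unit sphere."""
    d = [xi - fi for xi, fi in zip(x, fx)]             # direction of the ray
    a = sum(di * di for di in d)
    b = 2 * sum(xi * di for xi, di in zip(x, d))
    c = sum(xi * xi for xi in x) - 1.0                 # <= 0 for x in the disk
    t = (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)  # the root with t >= 0
    return [xi + t * di for xi, di in zip(x, d)]

inside = retract([0.2, 0.1], [-0.3, 0.4])      # hypothetical x and f(x)
assert math.isclose(sum(c * c for c in inside), 1.0)

on_boundary = retract([1.0, 0.0], [0.0, -0.5])
assert on_boundary == [1.0, 0.0]               # h restricts to the identity on the boundary
```

The formula depends continuously on $x$ and $f(x)$ as long as they are distinct, which is exactly where the assumption of no fixed point enters the argument.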
Our next example is $\pi_k(U_n)$, where $U_n$ denotes the unitary group.


Proposition. If $G$ is a Lie group and $H$ is a closed subgroup, then the projection
$G \to G/H$ is a fibering in the sense of Serre, with fiber $H$.


FIGURE 13


As an example, consider $U_{n-1} \subset U_n$, as all unitary $n \times n$ matrices of
the form

$$\begin{pmatrix}
1 & 0 & \cdots & 0 \\
0 & x_{1,1} & \cdots & x_{1,n-1} \\
\vdots & \vdots & & \vdots \\
0 & x_{n-1,1} & \cdots & x_{n-1,n-1}
\end{pmatrix}$$

Then $U_n/U_{n-1} = S^{2n-1}$. Thus the mapping $\pi\colon U(n) \to S^{2n-1}$ is a fibering
with fiber $U_{n-1}$. The homotopy exact sequence for this fibering is

$$\cdots \to \pi_{k+1}(S^{2n-1}) \to \pi_k(U_{n-1}) \to \pi_k(U_n) \to \pi_k(S^{2n-1}) \to \cdots$$

For $k + 1 < 2n - 1$, this implies $\pi_k(U_{n-1}) = \pi_k(U_n) = \pi_k(U_{n+1}) = \cdots$. Hence
$\pi_k(U_n)$ is independent of $n$ as $n \to \infty$. We call the common value of $\pi_k(U_n)$
for $n$ large, $\pi_k(U)$. Similarly, $\pi_k(O_n)$ is independent of $n$ for large $n$, so we
may define $\pi_k(O) = \pi_k(O_n)$, $n$ large. Also $\pi_k(Sp_n)$ is independent of $n$ for $n$
large, so we may define $\pi_k(Sp) = \pi_k(Sp_n)$, $n$ large.

As an aside, we note that the exact sequence for a fibering can be used
to calculate $\pi_k(S^1)$. We have a fibering $\exp\colon R \to S^1$, which takes $\theta$ to $e^{i\theta}$.
The fiber of this map is in one-to-one correspondence with the set $Z$ of integers.
Thus, we get an exact sequence

$$\pi_k(Z, *) \to \pi_k(R, *) \to \pi_k(S^1, *) \to \pi_{k-1}(Z, *) \to \pi_{k-1}(R, *) \to \pi_{k-1}(S^1, *)$$

Since $R$ is contractible to a point, $\pi_k(R, *) = 0$ for all $k$, and since $Z$ is a discrete
union of points $\pi_k(Z, *) = 0$ for $k > 0$. Hence $\pi_k(S^1, *) = 0$ for $k > 1$.
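The covering homotopy property of $\exp\colon R \to S^1$ can be imitated numerically by lifting a sampled loop step by step. The sketch assumes the loop is given as a list of unit complex numbers in which consecutive samples are less than half a turn apart; the function names are illustrative.

```python
import cmath
import math

def lift(points):
    """Lift a sampled path on the unit circle to a continuous angle function,
    starting at the principal argument of the first sample."""
    angles = [cmath.phase(points[0])]
    for prev, cur in zip(points, points[1:]):
        angles.append(angles[-1] + cmath.phase(cur / prev))   # increment in (-pi, pi]
    return angles

def winding_number(points):
    """End point of the lifted loop minus its starting point, in units of 2*pi."""
    angles = lift(points)
    return round((angles[-1] - angles[0]) / (2 * math.pi))

N = 400
loop = [cmath.exp(2j * math.pi * 3 * j / N) for j in range(N + 1)]   # wraps 3 times
assert winding_number(loop) == 3
```

The end point of the lifted path lies in the fiber over the base point, a copy of $Z$, and the integer obtained is the winding number of the loop.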
Now consider the Hopf sequence

$$S^1 \to S^3 \to S^2$$

Then we get an exact sequence

$$\pi_3(S^1) \to \pi_3(S^3) \to \pi_3(S^2) \to \pi_2(S^1)$$

Since $\pi_3(S^1) = \pi_2(S^1) = 0$, the map $\pi_3(S^3) \to \pi_3(S^2)$ is an isomorphism.
Hence $\pi_3(S^2) = Z$.
Spheres other than $S^1$ have nonvanishing homotopy in infinitely many
dimensions.


4 THE CLASSICAL GROUPS 


Let $G$ be a connected compact Lie group. Such a group has a Riemannian
structure that is invariant under right and left translation. In fact, if
we start with a positive definite symmetric bilinear form $B$ on the Lie algebra
$\mathfrak{g}$ of $G$, the bilinear form $B'$ defined by

$$B'(X, Y) = \int_{g \in G} B(\mathrm{Ad}\, g \cdot X, \mathrm{Ad}\, g \cdot Y)\, d\mu(g)$$

(where $\mu$ is Haar measure) is a positive definite symmetric bilinear form that
is invariant under the adjoint action. Then we get a right and left invariant
form on the group by right translating this form around the group.
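The averaging construction can be illustrated in a toy case. The sketch below is not the adjoint action of the text: as a stand-in it lets the circle group act on the plane by rotations and replaces the Haar integral by an average over equally spaced angles. All names and the sample form are assumptions of the example.

```python
import math

def rot(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def apply(g, x):
    return [g[0][0] * x[0] + g[0][1] * x[1], g[1][0] * x[0] + g[1][1] * x[1]]

def form(B, x, y):
    """B(x, y) = x^T B y for a 2 x 2 matrix B."""
    return sum(x[i] * B[i][j] * y[j] for i in range(2) for j in range(2))

def averaged(B, x, y, samples=360):
    """Riemann-sum stand-in for the Haar integral of B(g x, g y) over the circle group."""
    total = 0.0
    for k in range(samples):
        g = rot(2 * math.pi * k / samples)
        total += form(B, apply(g, x), apply(g, y))
    return total / samples

B = [[1.0, 0.0], [0.0, 3.0]]          # positive definite, not rotation invariant
x, y = [1.0, 2.0], [3.0, 1.0]
g = rot(0.4)

# The averaged form takes the same value on (x, y) and on (g x, g y).
assert math.isclose(averaged(B, x, y), averaged(B, apply(g, x), apply(g, y)), rel_tol=1e-9)
```

In this toy case the averaged form is twice the standard inner product, so it remains positive definite, in line with the general statement.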

Let's study $\Omega G$. To apply the Morse theory, we have to understand
what the geodesics are and know what all the conjugate points are.

(1) Let $T$ be a maximal torus in $G$ and let $p$ be a generic point of $T$
(that is, a point such that $\{p^n : n \in Z\}$ is dense in $T$). Then all geodesics from
$e$ to $p$ in $G$ are given by geodesics on $T$.

(2) Let $\mathfrak{g} = \mathfrak{h} + \mathfrak{p}$ be the orthogonal decomposition of the Lie algebra
of $G$ into the Lie algebra of $T$ and its orthogonal complement.

Then for any $t \in T$, $\mathrm{Ad}\, t$ maps $\mathfrak{p}$ into itself. Let $\mathrm{Ad}^*(t)$ denote the
restriction of $\mathrm{Ad}(t)$ to $\mathfrak{p}$. What are the conjugate points on $T$? Let $\gamma$ be a
geodesic on $T$ starting at $e$, that is, a one-parameter subgroup of $T$. Consider
the function $s \mapsto \operatorname{rank}(\mathrm{Ad}^*(\gamma(s)) - 1)$ from the real numbers $R$ into the nonnegative
integers. This function takes its maximum at all but a discrete set
of points. It turns out that the conjugate points to $e$ along $\gamma$ are the points
where this function drops below the maximum and the multiplicity is the
amount that this function drops. In particular the multiplicity is always an
even number, since $\operatorname{rank}(\mathrm{Ad}^*(t) - 1)$ is always even.


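Example. $SO(3)$


A maximal torus $T$ is

$$M(\theta) = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

The Lie algebra $\mathfrak{g}$ of $SO(3)$ is the set of skew $3 \times 3$ matrices. The Lie algebra
$\mathfrak{h}$ in $\mathfrak{g}$ corresponding to this maximal torus is spanned (as an $R$ vector space)
by

$$\begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$

Its orthogonal complement $\mathfrak{p}$ is spanned by

$$A_1 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \qquad A_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix}$$

Then

$$\begin{aligned}
\mathrm{Ad}\, M(\theta)(yA_1 + zA_2) &= M(\theta)(yA_1 + zA_2)M(\theta)^{-1} \\
&= y(\cos\theta\, A_1 - \sin\theta\, A_2) + z(\sin\theta\, A_1 + \cos\theta\, A_2).
\end{aligned}$$

Thus $\mathrm{Ad}^* M(\theta) - 1$ has rank 2, except where $\theta$ is a multiple of $2\pi$, where it
has rank 0.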
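The action of $\mathrm{Ad}\, M(\theta)$ on the span of $A_1$ and $A_2$ can be checked by direct matrix multiplication. The sketch uses plain nested lists for $3 \times 3$ matrices, with $M(\theta)$, $A_1$, and $A_2$ entered exactly as displayed; the helper names and the tolerance are assumptions of the example.

```python
import math

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(P):
    return [[P[j][i] for j in range(3)] for i in range(3)]

def M(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

A1 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
A2 = [[0, 0, 0], [0, 0, 1], [0, -1, 0]]

def ad(theta, A):
    """M(theta) A M(theta)^{-1}; the inverse of a rotation is its transpose."""
    return matmul(matmul(M(theta), A), transpose(M(theta)))

def coords(A):
    """Coefficients (y, z) of a matrix of the form y*A1 + z*A2."""
    return (A[0][2], A[1][2])

def rank_ad_minus_one(theta, tol=1e-9):
    """Rank of the 2 x 2 matrix of Ad M(theta) - 1 on the span of A1 and A2."""
    c1, c2 = coords(ad(theta, A1)), coords(ad(theta, A2))
    a, c = c1[0] - 1.0, c1[1]
    b, d = c2[0], c2[1] - 1.0
    if abs(a * d - b * c) > tol:
        return 2
    return 0 if max(abs(a), abs(b), abs(c), abs(d)) < tol else 1

print(coords(ad(0.3, A1)))   # coefficients of A1 and A2 in Ad M(0.3) A1
print(rank_ad_minus_one(1.0), rank_ad_minus_one(2 * math.pi))
```

On this two-dimensional space $\mathrm{Ad}\, M(\theta)$ acts as a rotation through the angle $\theta$, so $\mathrm{Ad}^* M(\theta) - 1$ is invertible unless the rotation is trivial.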

In Fig. 14 we unwrap $T$, and display the geodesics beginning at $e$ and
ending at an arbitrary point $p$ that is not $e$. Each arrow in the figure represents
a geodesic and the number written above it is the number of conjugate
points it crosses, multiplicities counted. Thus, the Morse theory yields

$$\Omega SO(3) = (e_0 \cup e_2 \cup e_4 \cup \cdots) \oplus (e_0 \cup e_2 \cup e_4 \cup \cdots)$$

where $\oplus$ denotes disjoint union. In particular, we obtain

$$\pi_0\{\Omega SO(3)\} = \pi_1 SO(3) = Z_2$$


FIGURE 14


Using these techniques, one can show that for any semisimple Lie group

$$\Omega G = e_0 \cup \cdots \cup e_0 \cup e_2 \cup \cdots \cup e_2 \cup e_4 \cup \cdots$$

As a consequence, we get that $\pi_1(G)$ is finite, $\pi_2(G) = 0$, and $\pi_3(G)$ is free.


Example. $SU_2$


Here a maximal torus is

$$M(\theta) = \begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{-i\theta} \end{pmatrix}$$

A computation similar to that which we did for $SO(3)$ shows that $\mathrm{Ad}^* M(\theta) - 1$
has rank 2, except at multiples of $\pi$, where it has rank 0.

We now mimic our diagram for $SO_3$. That is, we unwrap the maximal
torus and represent the geodesics connecting $e$ with an arbitrary point $p$ that
is neither $e$ nor $-e$. Again, we write the number of conjugate points that
each geodesic crosses above the arrow that represents the geodesic. We can
read off the Morse theoretical decomposition of $\Omega SU_2$ from the diagram.
Namely,

$$\Omega SU_2 \sim e_0 \cup e_2 \cup e_4 \cup \cdots$$

Note that $SU_2$ is the twofold covering of $SO_3$ and $\Omega SO_3$ is the disjoint
union of $\Omega SU_2$ with itself. As an exercise, the reader might verify that if
$X$ and $Y$ are connected spaces and $X$ is an $n$-fold covering of $Y$, then $\Omega Y$ is
(up to homotopy) the $n$-fold disjoint union of $\Omega X$ with itself.
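The counting rule behind the two diagrams can be written out. The sketch rests on assumptions that should be checked against the figures: the maximal torus is unwrapped to a line on which the group element repeats with period $2\pi$, conjugate points sit at the nonzero multiples of $2\pi$ for $SO(3)$ and of $\pi$ for $SU_2$, each has multiplicity 2, and the end point corresponds to the angle 1.0, which is an arbitrary choice.

```python
import math

def index(target, spacing, multiplicity=2):
    """Multiplicity-weighted number of conjugate points strictly between 0 and
    target, with conjugate points at the nonzero multiples of spacing
    (target is assumed not to be such a multiple)."""
    return multiplicity * (math.ceil(abs(target) / spacing) - 1)

def indices(theta0, period, spacing, ks):
    """Indices of the geodesics from 0 to theta0 + k * period, for k in ks."""
    return sorted(index(theta0 + k * period, spacing) for k in ks)

ks = range(-2, 2)
print(indices(1.0, 2 * math.pi, 2 * math.pi, ks))   # SO(3): each index occurs twice
print(indices(1.0, 2 * math.pi, math.pi, ks))       # SU_2: each even index occurs once
```

Under these assumptions each even index occurs twice for $SO(3)$ and once for $SU_2$, consistent with $\Omega SO_3$ being the disjoint union of two copies of $\Omega SU_2$.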


Example. $SU_3$


In $SU_3$, the maximal torus is

$$\left\{ \begin{pmatrix} e^{i\theta_1} & 0 & 0 \\ 0 & e^{i\theta_2} & 0 \\ 0 & 0 & e^{i\theta_3} \end{pmatrix} : \theta_1 + \theta_2 + \theta_3 = 0 \right\}$$

Again we can unwrap the maximal torus. Since it is two-dimensional, we
get a plane. The set of $t$ for which $\mathrm{Ad}^* t - 1$ has rank $< 6$ consists of three
families of parallel straight lines equally spaced (Fig. 15). For $t$ on any of
these lines but not on an intersection point, the rank of $\mathrm{Ad}^* t - 1$ is 4. On an
intersection point the rank of $\mathrm{Ad}^* t - 1$ is 0. Thus if $p$ is a general point in $T$,
we can determine the Morse theoretical decomposition of $\Omega SU_3$ by marking
all the points in the diagram corresponding to $p$, drawing a line segment from
$e$ to each of the points, and counting the number of times this line segment
crosses one of the solid lines.


Topology and Differential Geometry 495 


FIGURE 15 


Remark. In a compact connected Lie group, all maximal tori are conjugate, 
so we get essentially the same picture no matter which maximal torus we pick. 

Now our aim is to show how to use Morse theory to compute the stable 
homotopy groups. For this, we will need to generalize the notion of non- 
degenerate critical point. 


Example. Consider a torus lying on the table with its height function. Precisely,
what we want to consider is obtained by rotating the following around
the $z$ axis: a circle in the $xz$ plane that does not intersect the $z$ axis. This is
a torus $T$. Let $f$ denote the restriction of the $z$ coordinate to $T$. The maxima
of $f$ constitute an $S^1$, and so do the minima. Thus the maxima and the minima
are degenerate in the sense of our earlier definition. However, the circles of
maxima and minima are nondegenerate in the sense of the following definition.


Definition. Let $V \subset M$ be a critical manifold for $f$ on $M$ (i.e., a submanifold
such that $df = 0$ on it). Then $V$ is nondegenerate if and only if $Hf$ has the
tangent space to $V$ as its null space; that is, $Hf$ is nonsingular in the normal
directions to $V$.
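As a check of this definition against the torus example, here is a worked computation in coordinates. The radii $R > r > 0$ and the angles $\theta, \varphi$ are notation introduced for this sketch only.

```latex
\begin{aligned}
(x, y, z) &= \bigl((R + r\cos\varphi)\cos\theta,\ (R + r\cos\varphi)\sin\theta,\ r\sin\varphi\bigr),
  \qquad f = z = r\sin\varphi, \\
df &= r\cos\varphi\, d\varphi = 0 \iff \varphi = \pm\tfrac{\pi}{2}, \\
Hf\big|_{\varphi = \pm\pi/2} &= \begin{pmatrix} 0 & 0 \\ 0 & \mp r \end{pmatrix}
  \quad \text{in the coordinates } (\theta, \varphi).
\end{aligned}
```

The null space of $Hf$ is spanned by $\partial/\partial\theta$, which is exactly the tangent direction of each critical circle, so both circles are nondegenerate critical manifolds. The Hessian has one negative eigenvalue along the circle of maxima and none along the circle of minima.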


Proposition 1. The number of negative eigenvalues of $Hf$ at $p \in V$ is independent
of the point $p \in V$, if $V$ is a connected nondegenerate critical manifold. We call
this number the index $\lambda(V)$ of $V$ as a critical manifold.


Proposition 2. If $V$ is a nondegenerate critical manifold of $f$, then $f$ can be
jiggled in the vicinity of $V$ so as not to affect other critical points of $f$, and in
such a manner that $f$ becomes nondegenerate near $V$, with points of index
$\ge \lambda(V)$.




One way of doing this is the following. Let $TV$ be a tubular neighborhood
of $V$ with projection $\pi: TV \to V$. Let $g$ be a nondegenerate (in the old
sense) function on $V$. Let $\psi$ be a $C^\infty$ function on the ambient manifold $M$
that vanishes outside of $TV$ and is identically 1 on $V$. Then $f + \varepsilon\psi(g \circ \pi)$ is
nondegenerate for some $\varepsilon \ne 0$.


Proposition 3. Let $G$ be a compact Lie group or, more generally, a compact
symmetric space. Let $p$ and $q$ be arbitrary points on $G$. Then the critical sets
in $\Omega_p^q$ are all nondegenerate manifolds.

Here we use $\Omega_p^q$ to denote the space of piecewise differentiable curves
from $p$ to $q$, which we called $\Omega G$ in our section on Morse theory. What we
have said here is very imprecise; what is meant is that critical sets in M," 
have the property that we have stated. 

For example, consider the $n$-sphere. If $p$ and $q$ are not antipodes,
then the critical sets are nondegenerate points. If $p$ and $q$ are antipodes,
then the critical sets are nondegenerate manifolds, beginning with $S^{n-1}$.

The idea for computing the stable homotopy groups of the classical 
groups is to study nondegenerate critical manifolds in loop spaces. 

Recall the definition of the stable homotopy groups.

$$\begin{aligned}
\pi_k(U) &= \pi_k(U_n) && n \text{ large} \\
\pi_k(O) &= \pi_k(O_n) && n \text{ large} \\
\pi_k(Sp) &= \pi_k(Sp_n) && n \text{ large}
\end{aligned}$$

Let us also note some other relations. As a topological space, $U_n$
decomposes as the product $SU_n \times S^1$. Thus the problems of computing
$\pi_k(U_n)$ and $\pi_k(SU_n)$ are equivalent.

$\mathrm{Spin}(n)$ is the twofold covering of $SO_n$, so the fibration

$$\begin{array}{ccc} Z_2 & \longrightarrow & \mathrm{Spin}(n) \\ & & \downarrow \\ & & SO_n \end{array}$$

gives an isomorphism $\pi_k(\mathrm{Spin}(n)) = \pi_k(SO_n)$ for $k \ge 2$. Recall that $O_n$, $U_n$,
and $Sp_n$ are the groups of linear transformations in $R^n$, $C^n$, and $H^n$ preserving
the inner product $(x, y) = \sum_i x_i \bar{y}_i$. One uses $H$ to denote the quaternions.
The symplectic group $Sp_n$ can also be described as the set of unitary $2n \times 2n$
matrices such that

$$U^t J U = J$$

where the superscript $t$ denotes transpose, and

$$J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$$
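For $n = 1$ the matrix description can be tested numerically: every matrix of the form $\begin{pmatrix} a & b \\ -\bar{b} & \bar{a} \end{pmatrix}$ with $|a|^2 + |b|^2 = 1$ should be unitary and satisfy $U^t J U = J$. The sketch below checks this for one sample element. It assumes the sign convention $J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ (the opposite sign defines the same group), and the helper names and sample angles are illustrative.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(m):
    return [[m[j][i] for j in range(2)] for i in range(2)]

def conj_transpose(m):
    return [[m[j][i].conjugate() for j in range(2)] for i in range(2)]

def max_abs_diff(a, b):
    """Largest entrywise distance between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

J = [[0, 1], [-1, 0]]
IDENTITY = [[1, 0], [0, 1]]

def su2_element(alpha, beta, gamma):
    """The matrix [[a, b], [-conj(b), conj(a)]] with |a|^2 + |b|^2 = 1."""
    a = math.cos(gamma) * cmath.exp(1j * alpha)
    b = math.sin(gamma) * cmath.exp(1j * beta)
    return [[a, b], [-b.conjugate(), a.conjugate()]]

u = su2_element(0.7, -1.1, 0.4)
unitarity_error = max_abs_diff(matmul(conj_transpose(u), u), IDENTITY)
symplectic_error = max_abs_diff(matmul(matmul(transpose(u), J), u), J)
```

Both errors vanish up to rounding because, for any $2 \times 2$ matrix $M$, one has $M^t J M = (\det M)\, J$, and the determinant here is $|a|^2 + |b|^2 = 1$.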




The fact that $\pi_k(Sp_n)$ is constant for large $n$ is a consequence of the existence
of the fibration

$$\begin{array}{ccc} Sp_n & \longrightarrow & Sp_{n+1} \\ & & \downarrow \\ & & S^{4n+3} \end{array}$$

The existence of this fibration is a consequence of the fact that $Sp_n$ is the
group of linear transformations of $H^n$ preserving the inner product $(x, y) = \sum_i x_i \bar{y}_i$.


Proposition. Any connected Lie group is topologically the product of its maximal 
compact subgroup and a space homeomorphic to Euclidean space. 
For example, 


$Gl(n, R) = O_n \times \{\text{positive definite symmetric matrices}\}$
$Gl(n, C) = U_n \times \{\text{positive definite Hermitian matrices}\}$
$Gl(n, H) = Sp_n \times \{\ldots\}$.


Thus, as far as algebraic topology is concerned, it is enough to consider 
compact groups. 


Theorem.
$$\begin{aligned}
\pi_k(U) &= \pi_{k+2}(U) \\
\pi_k(O) &= \pi_{k+4}(Sp) \\
\pi_k(Sp) &= \pi_{k+4}(O) \qquad k = 0, 1, \ldots
\end{aligned}$$


Now we begin a sketch of the proof. Consider the space $\Omega_I^{-I} SU_{2n}$
of piecewise smooth paths beginning at $I$ and ending at $-I$. This accords
with our philosophy of considering very degenerate loop spaces. Consider
the arc

$$\mu(t) = \begin{pmatrix} e^{i\pi t} & 0 \\ 0 & e^{-i\pi t} \end{pmatrix}, \qquad 0 \le t \le 1,$$

in $SU_{2n}$, written in $n \times n$ blocks.

A maximal torus $T$ for $SU_{2n}$ is the set of diagonal matrices.

The arc $\mu$ is a shortest geodesic between $I$ and $-I$. Furthermore,
the singular set of conjugate points on $T$ is the union of the tori on which two
entries are equal (e.g., the $i$th and $j$th entries).

Thus $\mu$ does not cross any of the singular tori. To compute the index
of what we are attaching, we have to count the number of times the arc crosses
the singular tori for $0 < t < 1$.

Observe that if $x$ is anything in $U_{2n}$, then the curve

$$t \mapsto x\mu(t)x^{-1}$$




is another geodesic segment from $I$ to $-I$ in $SU_{2n}$. Thus

$$x \mapsto \{t \mapsto x\mu(t)x^{-1}\}$$

is a map

$$U_{2n} \to \{\text{geodesics going from } I \text{ to } -I\}.$$

Note that if $x$ is of the form

$$\begin{pmatrix} x_1 & 0 \\ 0 & x_2 \end{pmatrix}$$

with $x_i \in U_n$, then $x\mu(t)x^{-1} = \mu(t)$. Hence, our map factors through

$$U_{2n}/U_n \times U_n$$

The resulting map

$$U_{2n}/U_n \times U_n \to \{\text{geodesics going from } I \text{ to } -I\}$$

is one-to-one and onto the set of shortest geodesics going from $I$ to $-I$. This
is a consequence of the "spectral theorem":

(1) Every element of $G$ is in some maximal torus.

(2) All maximal tori are conjugate if $G$ is a compact connected Lie
group.

Now the modified Morse theory (where one considers nondegenerate
critical manifolds rather than nondegenerate critical points) yields

$$\Omega SU_{2n} \simeq U_{2n}/U_n \times U_n \cup \cdots$$

and the problem is to find the lowest-dimensional cell one needs to attach
where we have written the three dots. (By "jiggling" the Morse function a
bit, one can guarantee that one needs only attach cells to the first critical
manifold $U_{2n}/U_n \times U_n$. That is, one can "jiggle" the function a bit to
guarantee that all the critical points not on $U_{2n}/U_n \times U_n$ are nondegenerate.)
The important point is that the size of the cells that have to be
attached tends to $\infty$ as $n$ goes to $\infty$. Thus, we obtain

$$\pi_k\{\Omega SU_{2n}\} = \pi_k\{U_{2n}/U_n \times U_n\}$$

for $k$ small relative to $n$, since homotopy in low dimensions is not affected by
adding cells of high dimensions. This formula is equivalent to

$$\text{(a)} \qquad \pi_{k+1}\{SU_{2n}\} = \pi_k\{U_{2n}/U_n \times U_n\}$$

Now we will show

$$\text{(b)} \qquad \pi_k(U_n) = \pi_{k+1}\{U_{2n}/U_n \times U_n\}$$



and we will be done. We have a choice of two methods here. We can use
the same techniques applied to symmetric spaces rather than to groups and
get

$$\Omega(U_{2n}/U_n \times U_n) \simeq U_n \cup e_{\nu(n)} \cup \cdots$$

where $\nu(n)$ tends to $\infty$ with $n$. This is the method we have to use in the case
of the orthogonal group; in fact, we have to apply it eight times. However,
in this case there is an alternate method for step (b), which is much more
elementary. Apply the exact sequence coming from the fibration

$$\begin{array}{ccc} U_n & \longrightarrow & U_{2n} \\ & & \downarrow \\ & & U_{2n}/U_n \end{array}$$

This sequence is

$$\pi_k(U_n) \xrightarrow{\ \cong\ } \pi_k(U_{2n}) \to \pi_k(U_{2n}/U_n) \to \pi_{k-1}(U_n) \xrightarrow{\ \cong\ } \pi_{k-1}(U_{2n})$$

The precise form of our statement that $\pi_k(U_n)$ is stable for large $n$ is that we
have isomorphisms in the indicated places for $k$ small relative to $n$. It follows
from this exact sequence that $\pi_k(U_{2n}/U_n) = 0$ for $k$ small relative to $n$.
But now we have another fibering

$$\begin{array}{ccc} U_n = (U_n \times U_n)/U_n & \longrightarrow & U_{2n}/U_n \\ & & \downarrow \\ & & U_{2n}/U_n \times U_n \end{array}$$

In general, an inclusion $G \supset H \supset K$ of Lie groups gives rise to a fibration

$$\begin{array}{ccc} H/K & \longrightarrow & G/K \\ & & \downarrow \\ & & G/H \end{array}$$

It follows that we get an exact sequence

$$\cdots \to \pi_{k+1}(U_{2n}/U_n) \to \pi_{k+1}(U_{2n}/U_n \times U_n) \to \pi_k(U_n) \to \pi_k(U_{2n}/U_n) \to \cdots$$

Since $\pi_k(U_{2n}/U_n) = 0$ for low $k$, we get the isomorphism (b) for low $k$.

Combining (a) and (b), we get the periodicity for $\pi_k(U)$. It is not too
difficult to compute $\pi_0(U)$ and $\pi_1(U)$. After this is done, we get $\pi_k(U)$ for
all $k$ by periodicity. Thus


Corollary
$$\pi_k(U) = \begin{cases} 0 & k \text{ even} \\ Z & k \text{ odd} \end{cases}$$
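For reference, the chain of isomorphisms behind the corollary can be written out as follows. This is a sketch valid for $1 \le k$ small relative to $n$; the last step uses the product decomposition of $U_{2n}$ as $SU_{2n} \times S^1$, which does not change homotopy groups in dimension $\ge 2$.

```latex
\pi_{k-1}(U_n)
  \;\overset{(b)}{\cong}\; \pi_k\{U_{2n}/U_n \times U_n\}
  \;\overset{(a)}{\cong}\; \pi_{k+1}\{SU_{2n}\}
  \;\cong\; \pi_{k+1}(U_{2n}),
\qquad\text{hence}\qquad
\pi_{k-1}(U) \cong \pi_{k+1}(U).
```

Since $U_n$ is connected, $\pi_0(U) = 0$, and the determinant $U_n \to S^1$ together with the simple connectivity of $SU_n$ gives $\pi_1(U) = Z$. The two starting values then propagate with period 2.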




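In the case of the orthogonal group, this game doesn't work, so we have to
write down the formulas for the models. We get

$$\begin{aligned}
\Omega SO_{2n} &\simeq SO_{2n}/U_n \cup \cdots \\
\Omega Sp_n &\simeq Sp_n/U_n \cup \cdots \\
\Omega\{Sp_n/U_n\} &\simeq U_n/O_n \cup \cdots \\
\Omega\{U_{2n}/O_{2n}\} &\simeq O_{2n}/O_n \times O_n \cup \cdots \\
\Omega\{SO_{4n}/U_{2n}\} &\simeq U_{2n}/Sp_n \cup \cdots \\
\Omega\{U_{4n}/Sp_{2n}\} &\simeq Sp_{2n}/Sp_n \times Sp_n \cup \cdots
\end{aligned}$$

where the cells indicated by the dots have dimensions tending to $\infty$ as $n$
tends to $\infty$. The periodicity theorem follows from these relations, together
with the elementary relations

$$\begin{aligned}
\Omega\{Sp_{2n}/Sp_n \times Sp_n\} &\simeq Sp_n \\
\Omega\{O_{2n}/O_n \times O_n\} &\simeq O_n
\end{aligned}$$

(Recall that $\pi_k(\Omega X) = \pi_{k+1}(X)$, and attaching a high-dimensional cell does
not affect homotopy in low dimensions.)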


5 COHOMOLOGY 


From the point of view of homotopy theory, cohomology arises from
considering homotopy classes of maps of a space into a group. The natural
definition of group in the homotopy category is the following: A group is a
pair $(G, m)$, where $G$ is a space with base point $*$ and $m: G \times G \to G$ is a map
satisfying the following axioms. (We will work entirely with base-point-preserving
maps.)

(1) The following diagram is commutative:

$$\begin{array}{ccc}
G \times G \times G & \xrightarrow{\ m \times 1\ } & G \times G \\
{\scriptstyle 1 \times m} \downarrow & & \downarrow {\scriptstyle m} \\
G \times G & \xrightarrow{\ \ m\ \ } & G
\end{array}$$

(2) There exists a map $\alpha: G \to G$ such that the composition

$$G \xrightarrow{\ \langle \alpha, 1 \rangle\ } G \times G \xrightarrow{\ m\ } G$$

is the constant map (in the homotopy category).

(3) The composition

$$G = * \times G \to G$$

is the identity (in the homotopy category).




Note that if we were working in the category of topological spaces and 
continuous mappings instead of the homotopy category, these axioms would 
be equivalent to the ordinary axioms for a topological group. 

However, these are considerably weaker than the axioms for a topological
group. For example, $\Omega X$ is easily seen to be a group in the sense of
homotopy as we have defined it, but there is no reason why it should be a
topological group in general.

If $(G, m)$ is a group in the sense of homotopy and $X$ is a space, then the
set $[X, G]$ of homotopy classes of mappings of $X$ into $G$ is an honest group.
The product in $[X, G]$ is defined as follows: If $[f], [g] \in [X, G]$ have representatives
$f, g: X \to G$ then $[f][g] = [m \circ \langle f, g \rangle]$, where $\langle f, g \rangle: X \to G \times G$ is
defined by $\langle f, g \rangle(x) = \langle f(x), g(x) \rangle$.

We will define the cohomology $H^n(X; \pi)$ as the set of homotopy classes
$[X, K(\pi, n)]$ for a suitable group $K(\pi, n)$. This looks dual to our definition
of homotopy: $\pi_n(X) = [S^n, X]$. Let's see what is dual to the exact sequence
for a fibering.

A fibering was defined as a map $\pi: E \to B$ with the following property:
Given a map $f: X \to E$ and a homotopy $g_t$ of $\pi \circ f$, there exists a homotopy $f_t$
of $f$ that covers $g_t$.

This can be represented by the following diagram.

$$\begin{array}{ccc}
 & & E \\
 & {\scriptstyle f_t} \nearrow & \downarrow {\scriptstyle \pi} \\
X & \xrightarrow{\ g_t\ } & B
\end{array}$$

Now, if we reverse all the arrows, we get the condition for a map $i: K \to L$ to
be a co-fibering.


The condition is this. Given any map $f: L \to X$ and any homotopy $g_t$ of
$f \mid K$, there exists a homotopy $f_t$ of $f$ such that $f_t \mid K = g_t$.

For reasonable spaces (e.g., simplicial complexes), this condition implies
$i$ is one-to-one.

Just as one can write down exact sequences in homotopy for fiberings,
one can write down exact sequences in cohomology for co-fiberings. It is
much easier to compute cohomology, since ordinary spaces are naturally
built up by attaching cells and attaching a cell yields a co-fibration. That is,
the inclusion $i: X \to X \cup_\alpha e$ is always a co-fibration.




Let $i: X \to Y$ be an inclusion that is a co-fibering. Let $G$ be a group and
let $F(Z) = [Z, G]$ for any space $Z$. Then it is easily verified that the sequence

$$F(X) \leftarrow F(Y) \leftarrow F(Y/X)$$

is exact.
Let $\pi$ be an arbitrary Abelian group. Let $K(\pi, n)$ be a complex such that

$$\pi_k(K(\pi, n)) = \begin{cases} 0 & k \ne n \\ \pi & k = n \end{cases}$$

These spaces are very nice objects to map into. Most spaces are tough to
map into; we cannot even classify maps of spheres into them.
For example, $S^1$ is $K(Z, 1)$.
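Another example is $CP_\infty = CP_1 \cup CP_2 \cup \cdots \cup CP_n \cup \cdots$. There is a
fibration

$$\begin{array}{ccc} S^1 & \longrightarrow & S^{2n+1} \\ & & \downarrow \\ & & CP_n \end{array}$$

that gives rise to an exact sequence

$$\pi_k(S^1) \to \pi_k(S^{2n+1}) \to \pi_k(CP_n) \to \pi_{k-1}(S^1) \to \cdots$$

It follows that for $k < 2n + 1$,

$$\pi_k(CP_n) = \begin{cases} 0 & k \ne 2 \\ Z & k = 2 \end{cases}$$

Thus, we obtain

$$\pi_k(CP_\infty) = \begin{cases} 0 & k \ne 2 \\ Z & k = 2 \end{cases}$$

In other words, $CP_\infty$ is a $K(Z, 2)$.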
One can prove the existence of a $K(\pi, n)$ as follows: Write

$$\pi = (Z \oplus \cdots \oplus Z)/R$$

Construct a bouquet of $n$-spheres, one for each copy of $Z$ appearing in the
formula above. (A bouquet of spheres is constructed by taking a disjoint
union of spheres and identifying their base points.)

Now kill everything in $R$ by attaching $(n+1)$-cells. Kill the $(n+1)$-dimensional
homotopy by attaching $(n+2)$-cells, the $(n+2)$-dimensional homotopy by attaching
$(n+3)$-cells, and so on. After attaching cells in infinitely many dimensions, one gets
a $K(\pi, n)$.

Since $\pi_k(\Omega X) = \pi_{k+1}(X)$, one obtains: $\Omega K(\pi, n)$ is a $K(\pi, n - 1)$.




It can be shown that if $G$ is a group in the homotopy category, then $\Omega G$
is an Abelian group in the homotopy category.

Now define the $n$th cohomology group $H^n(X, \pi)$ of $X$ with coefficients in
$\pi$ by

$$H^n(X, \pi) = [X, K(\pi, n)]$$

Cohomology has the following properties.
(1) $H^n(X, \pi)$ is a contravariant functor in $X$.
(2) Given a co-fibering

$$X \to Y \to Y/X$$

there is an exact sequence

$$H^{n+1}(X) \leftarrow H^{n+1}(Y) \leftarrow H^{n+1}(Y/X) \leftarrow H^n(X) \leftarrow H^n(Y) \leftarrow H^n(Y/X)$$

Why should this cohomology look like what we ordinarily think of as
cohomology with cochains, coboundaries, and so forth? Let's try to compute
$H^n(X)$, where $X$ is a simplicial complex.


FIGURE 16 


Given a map $f: X \to K(\pi, n)$ (Fig. 16), we can find a homotopic map $f'$ such
that all $(n-1)$-simplices are mapped into the base point. Define a function $\phi$ on
$n$-simplices.

$$\phi(\sigma_n) = \text{homotopy class of } f' \text{ restricted to } \sigma_n$$

Then one verifies that $\phi$ is a cocycle, that is, the sum of all $\phi(\sigma_n)$ as $\sigma_n$ runs
over all faces of an $(n+1)$-simplex $\tau_{n+1}$ is 0. But this follows from the fact
that $f'$ restricted to the union of these faces is zero, since it extends over $\tau_{n+1}$.




If we deform $f$ in another way we get a different $\phi$, call it $\phi'$. Their
difference is the coboundary of a lower-dimensional cochain


$$\phi - \phi' = \delta\eta$$
Thus we get a map

$$[X, K(\pi, n)] \to \text{cocycles}/\text{coboundaries}$$


This map is an isomorphism. 

One has certain rough duality features between cohomology and homo- 
topy. For example, cohomology of a product is complicated (and of a fiber 
bundle even more so), while the homotopy of a product may be simply 
expressed in terms of the homotopy of the factors. On the other hand, the 
cohomology of a bouquet of spheres is very simple, while the homotopy is 
complicated. 

This duality is best understood if one thinks of the $K(\pi, n)$'s as a set of
basic building blocks for the category Hot dual to the cells that we are led
to at the very start.

Indeed, let $\pi_0, \pi_1, \ldots$ be the sequence of homotopy groups of a topological
space $X$. Then the space

$$X' = K(\pi_0, 0) \times K(\pi_1, 1) \times \cdots \times K(\pi_n, n) \times \cdots$$

will have the same homotopy groups as $X$, and it of course would be marvelous
if $X$ and $X'$ were of the same homotopy type. This is not true in general.
What is true (and was already alluded to in the first lecture) is this.

One may construct a sequence of spaces $E_1, E_2, \ldots, E_n, \ldots$ together with
maps $f_n: X \to E_n$, such that

(1) $f_n$ induces isomorphisms in homotopy up to and including the dimension $n$;

(2) $E_n$ is of the homotopy type of a fiber space over $E_{n-1}$ with fiber a
$K(\pi_n, n)$.

Note that (1) can easily be shown to imply that $f_n$ induces an isomorphism
of the homotopy classes $[P, X]$ of maps of $P$ into $X$ with $[P, E_n]$ for all
polyhedra of dim $< n - 1$. Thus, as far as can be observed with polyhedra
(i.e., small spaces of dimension $< n - 1$), $X$ can be identified with $E_n$. On
the other hand (2) implies that

$$E_n = E_{n-1} \times_\tau K(\pi_n, n)$$

so that by induction, $E_n$ is a twisted product of the $K(\pi_i, i)$, $i \le n$.




The construction of the $\{E_n\}$ and $\{f_n\}$ is in principle quite simple. We
set $E_n$ equal to $X$ with all homotopy groups in dimension $> n$ killed. Thus
one attaches cells inductively to kill all homotopy above $n$

$$E_n = X \cup e_{n+2} \cup \cdots$$

That this can be done follows from Proposition 4 of Section 3. We attach
the cells systematically. That is, having constructed $E_n$ we construct $E_{n-1}$
by killing all the homotopy groups of $E_n$ in dimension $\ge n$. Thus

$$E_{n-1} = E_n \cup e_{n+1} \cup \cdots$$

and in particular, $E_n$ is a subset of $E_{n-1}$.

The first property again follows directly from Proposition 4 of Section 3.
To see the second one, we must convert the inclusion map $E_n \to E_{n-1}$ into a
fibering. This can always be done in the following manner.

Let $A$ be a subspace of the space $X$ and let $i: A \to X$ be the inclusion
map. Let $\tilde{A}$ denote the space of maps $u: I \to X$ subject to the single boundary
condition $u(0) \in A$, and define $\pi: \tilde{A} \to X$ by $\pi(u) = u(1)$.

It is then pretty obvious that (1) the inclusion of $A$ into $\tilde{A}$ given by the
point paths: $a \mapsto u_a$, where $u_a(t) = a$, $t \in [0, 1]$, induces a homotopy equivalence
of $i$ with $\pi$; and (2) that $\pi$ is a fiber space in the sense of Serre, with fiber over
$x \in X$ the space of maps $u: I \to X$ subject to

$$u(0) \in A, \qquad u(1) = x$$

Applying this construction to the inclusion

$$E_n \to E_{n-1}$$

and using the homotopy sequence for the resulting fibering

$$\tilde{E}_n \to E_{n-1}$$

we find that the fiber has vanishing homotopy except in the dimension $n$ and
there has homotopy $\pi_n(X)$. Hence it is a $K(\pi_n, n)$. Q.E.D.

The sequence of spaces $\{E_n\}$ just described is called a Postnikov system
for $X$. As one by now has a deep understanding of how the cohomology of
a fiber space is put together out of the cohomology of base and fiber (the
spectral sequence), and knows explicitly the cohomology of all $K(\pi, n)$'s,
one can use the Postnikov system to compute a certain amount of information 
about the homotopy of X, and this is the most powerful general method devised 
so far for the computation of homotopy groups. 




6 VECTOR BUNDLES AND CHARACTERISTIC CLASSES 


Recall that we talked vaguely about twisted products and things like 
that. However, we went from the notion of twisted product to its consequence 
in homotopy (Serre fibration), since that was all that mattered before. Now 
we go back to the more precise notion of bundles. 


Definition. Vector Bundles. The intuitive notion is a family of vector spaces
parametrized by a topological space $X$. The prime example is the family of
tangent spaces to a manifold.

A vector bundle consists of a space $E$, a map $\pi: E \to X$ such that

(1) $\pi^{-1}(x)$ is a vector space of dimension $k$ over either $R$ or $C$.

(2) There is a covering $\{U\}$ of $X$ and $k$ sections $\{s_U^1, s_U^2, \ldots, s_U^k\}$ over
each $U \in \{U\}$ such that the map

$$\phi: U \times R^k \to \pi^{-1}(U)$$

defined by

$$\phi(x, r_1, \ldots, r_k) = \sum_i r_i\, s_U^i(x)$$

is a homeomorphism.

(By a section $s$ over $U$ we mean a continuous map $s: U \to \pi^{-1}U$ such
that $\pi \circ s = 1$.) Such a set of sections $s_U^1, s_U^2, \ldots, s_U^k$ will be called a frame
over $U$.

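Note that if $U$ and $V$ are two different sets in the cover then two frames,
one on $U$ and one on $V$, are related by a linear transformation on the overlap.

Note also that our map defines a commutative diagram:

$$\begin{array}{ccc}
U \times R^k & \xrightarrow{\ \phi\ } & \pi^{-1}(U) \\
 & \searrow \quad \swarrow & \\
 & U &
\end{array}$$

Example 1

A trivial bundle

$$\begin{array}{c} X \times R^k = E \\ \downarrow \\ X \end{array}$$

On this bundle, there is a "frame" of $k$ linearly independent sections.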




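Example 2


A nontrivial bundle

[Figure: the strip $I \times R$, with the point $\langle 0, x \rangle$ on one edge identified with the point $\langle 1, -x \rangle$ on the other.]

For $E$, we take $I \times R/\sim$, where $\langle 0, x \rangle \sim \langle 1, -x \rangle$ for all $x \in R$. For $B$, we
take $I/\sim$, where $0 \sim 1$. Thus $B$ is an $S^1$. This is a bundle, since locally there
is a frame. However, there is no global frame.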


Example 3


Let $V$ be a vector space. Let $G_k(V)$ denote the set of $k$-dimensional
subspaces of $V$. There is a natural way of putting a topology on $G_k(V)$.
We construct a vector bundle $E$ as follows: We let $E$ denote the set of all
$\langle v, A \rangle$ with $A \in G_k(V)$ and $v \in A$. This is a subset of $V \times G_k(V)$, so we
provide it with the induced topology. This is a vector bundle over $G_k(V)$,
with the projection map $\langle v, A \rangle \mapsto A$.


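Example 4


$G_1(R^2)$ consists of lines in the plane. The bundle in this case is isomorphic
to that of Example 2.

Let $E$ and $F$ be vector bundles over $X$. Then $E$ is isomorphic to $F$ if
and only if there is a diagram

$$\begin{array}{ccc}
E & \underset{\psi}{\overset{\phi}{\rightleftarrows}} & F \\
 & {\scriptstyle \pi} \searrow \quad \swarrow {\scriptstyle \pi'} & \\
 & X &
\end{array}$$

where $\phi$ and $\psi$ map the fiber over each point into the fiber over the same point
and are linear on fibers, and $\phi \circ \psi = 1$, $\psi \circ \phi = 1$.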




We write $E_x = \pi^{-1}(x)$, $F_x = \pi'^{-1}(x)$. Define a functor $X \rightsquigarrow F_n(X)$
as follows. We let $F_n(X)$ denote the set of isomorphism classes of $n$-dimensional
vector bundles over $X$, with the trivial class singled out.

Given $f: Y \to X$ and a vector bundle $\pi: E \to X$, we define

$$f^{-1}E = \{\langle e, y \rangle : \pi(e) = f(y)\}$$

Now we make $F_n$ into a contravariant functor by setting

$$F_n(f)(E) = f^{-1}E$$
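The set-level content of the pullback construction can be illustrated on finite sets. The sketch below is a toy model only: the "fibers" are three-point stand-ins for vector spaces, and all names and data are hypothetical.

```python
def pullback(total_space, projection, f, domain):
    """Finite model of the set {(e, y) : projection(e) == f(y)}."""
    return {(e, y) for y in domain for e in total_space
            if projection(e) == f(y)}

def fiber(pairs, y):
    """Points of the pulled-back space lying over y."""
    return {e for (e, y2) in pairs if y2 == y}

# Hypothetical toy data: base X = {0, 1}; the fiber over x is a
# three-point stand-in for a vector space, tagged by x.
X = [0, 1]
E = {(x, v) for x in X for v in (-1, 0, 1)}
projection = lambda e: e[0]

# A map f: Y -> X from a three-point space Y.
Y = ["a", "b", "c"]
f = {"a": 0, "b": 0, "c": 1}.get

pulled_back = pullback(E, projection, f, Y)
fiber_over_a = fiber(pulled_back, "a")
```

The fiber of the pulled-back space over `y` is a copy of the fiber of `E` over `f(y)`, which is why the construction carries bundles over $X$ to bundles over $Y$ and reverses the direction of maps.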


A basic question is, given a vector bundle by geometry, can you determine
which isomorphism class it is in? For example, let $M$ be a manifold
and $TM$ the tangent bundle. If $TM$ is trivial, we say $M$ is parallelizable.


Theorem 1. Let $X$ and $Y$ be compact spaces. Let $f, g: X \to Y$ be mappings that
are homotopic. Then $F_n(f) = F_n(g)$.
(That is, $F_n(X)$ is a homotopy invariant of $X$.)


Theorem 2. $F_n$ is representable, at least on finite simplicial complexes.


This means there exists a space $B_n$ and a natural equivalence

$$F_n(X) \cong [X, B_n]$$

where $[X, B_n]$ denotes the set of homotopy classes of maps of $X$ into $B_n$.
(However, $B_n$ is not a finite complex.) This theorem "reduces" the classification
problem of vector bundles to a homotopy problem. In general this
homotopy problem is very difficult. However, in the case $X$ is the sphere, we
can apply our results on the homotopy groups of the classical spaces. First,
some definitions.

Given a vector bundle

$$E \to X$$

then $E \times R$ constitutes a vector bundle of one higher dimension over $X$,
with $\pi': E \times R \to X$ defined by $\pi'(e, r) = \pi(e)$. This defines an operation
$F_n(X) \to F_{n+1}(X)$. We call $E$ and $F$ stably equivalent if in some $F_{n+k}(X)$ they
become isomorphic.


Theorem 3. The stable classes of vector bundles over X (X a finite complex) 
form an Abelian group K(X). 


From the fact that the $F_n$ are functors, it follows that $K$ is a functor.
This is the basic functor in K theory. 




Now we will sketch a proof of Theorem 1. Let $F: X \times I \to Y$ be a
homotopy of $X$ in $Y$ and let $\pi: E \to Y$ be a bundle over $Y$. Set

$$F_t(x) = F(x, t)$$

for all $x \in X$ and $t \in I$. Theorem 1 states that the isomorphism class of
$F_t^{-1}E$ is independent of $t$. It is enough to show that the two bundles over
$X \times I$, namely, $F^{-1}E$ and $\pi^{-1}F_0^{-1}E$ (where $\pi$ here denotes the projection
$X \times I \to X$), are isomorphic, since

$$F_t^{-1}E = i_t^{-1}F^{-1}E$$

and

$$F_0^{-1}E = i_t^{-1}\pi^{-1}F_0^{-1}E$$

where $i_t: X \to X \times I$ is the inclusion defined by $i_t(x) = \langle x, t \rangle$.

We are given an isomorphism of $E_1 = F^{-1}(E)$ and $E_2 = \pi^{-1}F_0^{-1}E$ on
the set $X \times 0$. We can construct a new vector bundle $W$, whose fiber is
$W_x = \mathrm{Hom}(E_{1,x}, E_{2,x})$. The isomorphism between $E_1$ and $E_2$ on $X \times 0$ is
a special sort of section over $X \times 0$. Now we use the following.


Proposition. Given a section of a vector bundle over X x 0, we can extend it 
to a section over X x I. 


The method of proof is to patch local solutions of this problem with a 
partition of unity. 

Now we extend our section over $X \times I$. The fact that our section gives
rise to an isomorphism in each fiber over $X \times 0$ can be expressed as the nonvanishing
of a certain determinant. Then there is an open neighborhood $U$
of $X \times 0$ such that this determinant does not vanish on $U$; that is, the section
gives rise to an isomorphism in each fiber over $U$. Since $X$ is compact,
$U \supset X \times [0, \varepsilon]$ for $\varepsilon > 0$ sufficiently small. Therefore

$$F^{-1}E \mid X \times [0, \varepsilon] \cong \pi^{-1}F_0^{-1}E \mid X \times [0, \varepsilon]$$

The same argument shows that for any $t \in I$,

$$F^{-1}E \mid X \times [t - \varepsilon, t + \varepsilon] \cong \pi^{-1}F_t^{-1}E \mid X \times [t - \varepsilon, t + \varepsilon]$$

Since $I$ is connected, this yields the desired result.
Let's go back to

$$F_n(X) \cong [X, B_n]$$

To any $E \in F_n(X)$ we associate a map $f_E: X \to B_n$. Thus $f_E^*: H^*(B_n) \to H^*(X)$
is obtained. Anything in the image of $f_E^*$ is called a characteristic
class of $E$. Any characteristic class of a trivial bundle is 0, since the corresponding
map is homotopic to a constant map. Thus the characteristic classes
are obstructions to trivializing a bundle.

Let’s see how to generalize the notion of vector bundle. First, let’s 
observe that the notion of vector bundle can be reformulated in terms of the 
general linear group. 

Let $E$ be a vector bundle over $X$, and let $\{U\}$ be a covering, with frames
$s_1^U, \ldots, s_n^U$ defined over a member $U$ of the cover $\{U\}$.

Let $U, V \in \{U\}$. Then

$$s_i^U(x) = \sum_j g_{ij}(x)\, s_j^V(x)$$

for a suitable matrix $g_{ij}(x)$. Note that $g_{ij}(x)$ is defined for all $x \in U \cap V$ and
that it is nonsingular. Thus we have defined a function

$$g = g^{UV}: U \cap V \to Gl(n)$$

(We take the $(i, j)$th entry of $g(x)$ to be $g_{ij}(x)$.)
The following is easily verified.


Proposition. On $U \cap V \cap W$

$$g^{UV}(x)\, g^{VW}(x) = g^{UW}(x)$$

We call this formula the cocycle condition.


Proposition. Given the covering $\{U\}$ and the cocycle $\{g^{UV}\}$, we can reconstruct
$E$, up to isomorphism.


In fact, given any covering $\{U\}$ and any 1-cocycle (relative to this covering)
with values in $Gl(n, R)$, we can construct a vector bundle $E$ as follows.
(By a cocycle, we mean a family $g^{UV}$ satisfying $g^{UW} = g^{UV}g^{VW}$.)

(1) Form the disjoint union of all $U \times R^n$, where $U$ ranges over $\{U\}$.

(2) For any $U, V$ in $\{U\}$ identify the subset $(U \cap V) \times R^n$ of $U \times R^n$
with the subset $(U \cap V) \times R^n$ of $V \times R^n$ by identifying $(x, v)$ with $(x, (g^{UV})^{-1}v)$,
for any $x \in U \cap V$ and any $v \in R^n$.

This sets up a one-to-one correspondence between bundles with specified
trivializations over the $\{U\}$ of our covering and cocycles with values in
$Gl(n, R)$.
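The cocycle condition can be seen concretely at a single point of a triple overlap. The sketch below assumes the convention that the frame over $U$ equals $g^{UV}$ times the frame over $V$, with frame vectors stored as matrix rows. The three sample frames are arbitrary invertible matrices chosen for illustration, and exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction

def frac_matrix(rows):
    """Convert nested lists of integers to a matrix of exact fractions."""
    return [[Fraction(v) for v in row] for row in rows]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse2(m):
    """Inverse of an invertible 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def transition(frame_u, frame_v):
    """Matrix g with frame_u = g * frame_v (frame vectors are the rows)."""
    return matmul(frame_u, inverse2(frame_v))

# Hypothetical frames of a 2-dimensional fiber at one point of U, V, W.
frame_u = frac_matrix([[1, 2], [0, 1]])
frame_v = frac_matrix([[2, 1], [1, 1]])
frame_w = frac_matrix([[1, 0], [3, 1]])

g_uv = transition(frame_u, frame_v)
g_vw = transition(frame_v, frame_w)
g_uw = transition(frame_u, frame_w)

cocycle_holds = matmul(g_uv, g_vw) == g_uw
```

The identity holds for any choice of frames, since the middle frame cancels in the product. That cancellation is the whole content of the cocycle condition.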


Remark. A 1-cochain is a function that is defined on the 1-skeleton of
a simplex. The coboundary $\delta c$ is a function on next-higher-dimensional
simplices given by the formula $\delta c(123) = c(23) - c(13) + c(12)$. We call
$c$ a cocycle if $\delta c = 0$.
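A minimal sketch of the coboundary formula in the remark, together with the analogous formula one step lower, $(\delta c)(ij) = c(j) - c(i)$, which is an assumption added here for illustration. It checks that a 1-cochain of the form $\delta c_0$ is always a cocycle, and that a generic 1-cochain is not. The vertex labels and sample values are arbitrary.

```python
from itertools import combinations

def coboundary0(c0):
    """delta of a 0-cochain: (delta c)(ij) = c(j) - c(i)."""
    return lambda i, j: c0[j] - c0[i]

def coboundary1(c1):
    """delta of a 1-cochain: (delta c)(ijk) = c(jk) - c(ik) + c(ij)."""
    return lambda i, j, k: c1(j, k) - c1(i, k) + c1(i, j)

vertices = [1, 2, 3, 4]
c0 = {1: 5, 2: -3, 3: 7, 4: 0}

# A 1-cochain that is a coboundary: delta applied twice gives zero.
exact = coboundary0(c0)
dd_values = [coboundary1(exact)(i, j, k)
             for i, j, k in combinations(vertices, 3)]

# A 1-cochain that is not a cocycle.
product = lambda i, j: i * j
product_on_123 = coboundary1(product)(1, 2, 3)
```

Written multiplicatively, the vanishing of $\delta c$ on a triangle reads $c(12)\,c(23) = c(13)$, which has the same shape as the cocycle condition for transition functions.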


But our cocycle condition is the same formula written multiplicatively,
except that our index set is an open cover of the space, rather than the vertices
of a simplex. However, one can think of a simplicial complex in terms of
open sets, by attaching to each vertex its open star. This is the set of all (open)
simplices that have the given vertex in their boundaries.

One can thus define 1-cocycles with values in any group. However,
to define higher cocycles, one needs an Abelian group.


Proposition. Consider two sets of local frames $\{s_i^U\}$ and $\{\bar{s}_i^U\}$, associated to
a cover $\{U\}$. Then we have matrix-valued functions

$$A^U: U \to Gl(n, R)$$

uniquely defined by $s_i^U = A^U \bar{s}_i^U$.


The new $\bar{g}^{UV}$ are related to $g^{UV}$ by $A^U \bar{g}^{UV} = g^{UV} A^V$. Note that this is the
non-Abelian analogue of being cohomologous.

In this way, one can construct an isomorphism between the set of
isomorphism classes of vector bundles $F_n(X)$ and the set $H^1(X, \underline{Gl(n, R)})$
consisting of equivalence classes of 1-cocycles with values in $Gl(n, R)$. The $H$
means Čech cohomology, and we underline $Gl(n, R)$ to indicate that we are
talking about sheaf cohomology.
We can replace $Gl(n)$ by any topological group $G$, and define $H^1(X, \underline{G})$
in the same way. An example of how this fact can be used is the following.
We have an exact sequence

$$0 \to Z_2 \to \mathrm{Spin}(n) \to SO_n \to 0$$




of topological groups. From our experience in cohomology, we would
expect to get an exact sequence of cohomology groups. This is the case.
One gets an exact sequence

$$H^1(X, Z_2) \xrightarrow{\ i_*\ } H^1(X, \mathrm{Spin}(n)) \to H^1(X, SO_n) \xrightarrow{\ \delta\ } H^2(X, Z_2)$$

Geometrically, one can think of an element of $H^1(X, SO_n)$ as an oriented
bundle with a Riemannian structure. An element of $H^1(X, \mathrm{Spin}(n))$ can be
thought of as a bundle with a spin structure. Thus an oriented bundle $E$
with an oriented Riemannian structure has a spin structure if and only if
$\delta E = 0$. Thus, we have shown that the existence of a spin structure is equivalent
to the vanishing of a certain second cohomology class. Note that the
ambiguity in defining a spin structure is measured by $i_* H^1(X, Z_2)$.

Let $\Gamma(E)$ be the set of continuous sections of $E$. (Examples of this are
tensor fields on a manifold.) Now if $E$ is constructed from the cocycle $g^{UV}$,
then $\Gamma(E)$ corresponds to twisted functions on $X$. Precisely, a section $s$ in
$\Gamma(E)$ corresponds to a set of functions

$$f_U: U \to R^n$$

such that on $U \cap V$,

$$f_U(x) = g^{UV}(x)\, f_V(x)$$


Example


On a manifold, let $\{U\}$ be a cover by coordinate neighborhoods with
corresponding coordinate functions $x_U^1, \ldots, x_U^n$. Then on the overlap
$U \cap V$ of two coordinate neighborhoods, we define

$$J^{UV}(x) = \left( \frac{\partial x_U^i}{\partial x_V^j} \right) \qquad (*)$$

This defines the cotangent bundle. All tensor bundles on $M$ are generated
from $J^{UV}(x)$ by representations of $Gl(n, R)$. (A representation of $Gl(n, R)$
is a homomorphism $\rho: Gl(n, R) \to Gl(m, R)$.) That is, any tensor bundle is
given by the cocycle $\rho(J^{UV}(x))$ for suitable $\rho$, where $J^{UV}(x)$ is given by (*).

However, one can construct other bundles that have nothing to do with
the Jacobian determinant.


Example
Bundles with group $Gl(1, R)$ are classified by $H^1(X, Z_2)$ since
$Gl(1, R) = Z_2 \times R$ and

$$H^1(X, Z_2 \times R) = H^1(X, Z_2)$$




In general, let $G$ be a connected Lie group and let $K$ be its maximal
compact subgroup. Then $H^1(X, G) \cong H^1(X, K)$.

Now, let us try to construct $m$-dimensional vector bundles over $S^n$,
that is, bundles with structure group $Gl(m, R)$. Let $U = S^n - \text{south pole}$,
$V = S^n - \text{north pole}$. Then $U \cap V = S^{n-1} \times R$. Note that $U$ and $V$ are
contractible. According to Theorem 1 of this section, if $U$ and $V$ were
compact, we could conclude that any vector bundle over $U$ or over $V$ is trivial.
Actually, Theorem 1 is true in much more generality than we have stated it,
and in fact it has an extension that includes this case. Thus the restriction
of our bundle to either $U$ or $V$ is trivial. It follows that the bundle is determined
up to isomorphism by the mapping $g^{UV}: S^{n-1} \times R \to Gl(m, R)$. This
construction leads to a one-to-one correspondence between $\pi_{n-1}(Gl(m, R))$
and isomorphism classes of $m$-dimensional bundles on $S^n$. For details of
the proof of this correspondence see N. Steenrod, Topology of Fiber Bundles
(§18), Princeton University Press, Princeton, N.J., 1951.


Corollary. Isomorphism classes of m-plane bundles over S" are independent of 
m when m is large compared to n. 


As a matter of fact, we have already seen this result in more generality 
(Theorem 3), since the set of isomorphism classes of m-plane bundles over X, 
for m large, is essentially the same as what we called K(X). 

From the knowledge of the groups $\pi_k(O)$, we obtain Table 1 for $K(S^k)$.


TABLE 1

k modulo 8    K(S^k)
1             Z_2
2             Z_2
3             0
4             Z
5             0
6             0
7             0
8             Z


Thus for $m$ large, there are two isomorphism classes of real $m$-plane bundles
on $S^1$, two on $S^2$, one on $S^3$, infinitely many on $S^4$, one on each of $S^5$, $S^6$, and $S^7$,
and infinitely many on $S^8$.
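The table can be regenerated from the stable groups $\pi_k(O)$ by the shift $K(S^k) = \pi_{k-1}(O)$, which is the correspondence between bundles on $S^k$ and clutching maps on $S^{k-1}$. The sketch below takes the eight values of $\pi_k(O)$, $k = 0, \ldots, 7$, as given input; the function names are illustrative.

```python
# Stable homotopy groups pi_k(O) for k = 0..7 (period 8), taken as input.
PI_O = ["Z_2", "Z_2", "0", "Z", "0", "0", "0", "Z"]

def pi_k_O(k):
    """pi_k(O) for k >= 0, extended by period 8."""
    if k < 0:
        raise ValueError("k must be nonnegative")
    return PI_O[k % 8]

def K_sphere(k):
    """Stable classes of real bundles on S^k for k >= 1: pi_{k-1}(O)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return pi_k_O(k - 1)

table = {k: K_sphere(k) for k in range(1, 9)}
```

Because of the period-8 pattern, `K_sphere(k + 8)` agrees with `K_sphere(k)` for every `k`, which is why the table is indexed by `k` modulo 8.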

We have shown

$$F_n(S^k) \cong [S^{k-1}, Gl(n, R)]$$




Recall now that we already asserted that for an arbitrary finite complex $X$ one
has

$$F_n(X) = [X, B_n]$$

for a suitable $B_n$.

The idea behind the construction of $B_n$ is the following. We want to
construct a bundle $\pi: E \to B_n$ such that for every bundle $\pi': E' \to X$, there is
a unique (up to homotopy) mapping $f: X \to B_n$ such that $E' \cong f^{-1}E$. Start
with $\pi': E' \to X$, $X$ compact. Let $\{U\}$ be a finite open cover of $X$, $\{s_U^i\}$ a
family of local frames and let $\{\rho_U\}$ be a partition of unity subordinate to $\{U\}$.
Then for each $i$ and each $U$ we set

$$\sigma_U^i(x) = \begin{cases} \rho_U(x)\, s_U^i(x) & x \in U \\ 0 & x \in X - U \end{cases}$$

Then $\sigma_U^i$ is a global section of $E$. These global sections generate a vector
subspace $V$ of the space $\Gamma(E)$ of sections of $E$. Now, given $x \in X$, we let $E_x$ be
the fiber over $x$. Define

$$e: V \to E_x$$

by $e(\sigma_U^i) = \sigma_U^i(x)$ and linearity. Then $e$ is a linear map of $V$ onto $E_x$. Thus
we get an exact sequence

$$0 \to K_x \to V \to E_x \to 0$$

Thus $x \mapsto K_x$ is a mapping of $X$ into $G_n(V)$, where $G_n(V)$ denotes the Grassmannian
whose points are the subspaces of codimension $n$. Call this mapping
$f$. Over the Grassmannian, there is a God-given bundle, called the quotient
bundle $Q$, and

$$E = f^{-1}Q$$

The bundle $Q$ is defined as follows. Let $A \in G_n(V)$. Then we have an
exact sequence

$$0 \to A \to V \to V/A \to 0$$

The fiber of $Q$ over a point $A$ is $V/A$.

One finds $F_n(X) \cong [X, G_n(V)]$ for some high-dimensional $V$. One again
proves a stability theorem: $[X, G_n(V)]$ is independent of $V$ if the dimension
of $V$ is large enough. Out of this, one constructs $B_n$.

Now we know

$$F_m(S^n) = \pi_n(G_m(V)) \quad \text{and} \quad F_m(S^n) \cong \pi_{n-1}(Gl(m, R))$$

Combining these, we get an isomorphism

$$\pi_{n-1}(Gl(m, R)) \cong \pi_n(G_m(V))$$

But this isomorphism also comes out of the exact sequence for the fibration

$$O(m) \to O(k + m)/O(k) \to O(k + m)/(O(k) \times O(m))$$

since $\pi_{n-1}(O(m)) = \pi_{n-1}(Gl(m, R))$, $\pi_{n-1}(O(k + m)/O(k)) = 0$ for $k$ large, and $O(k + m)/(O(k) \times O(m))$ is homeomorphic to $G_m(V)$ if $V$ is $(k + m)$-dimensional.


XIX 


Continuous Solutions of 
Linear Equations—Some 
Exceptional Dimensions 
in Topology 


BENO ECKMANN 


1 Vector Products and Continuous Solutions
1.1 Vector Products
1.2 Nonexistence Theorems
1.3 Continuous Solutions
1.4 Reduction
2 Continuous Multiplications in $R^{n+1}$
2.1 Adams' Theorem
2.2 Division Algebras
2.3 Parallelizability of Spheres
2.4 Almost-Complex Structures on Spheres
2.5 Complex Vector Products
3 Proof of Theorem A and General Remarks
3.1 Pseudoprojective Planes
3.2 Properties of $K(X)$
3.3 Proof of Theorem A'
3.4 Remarks
References


This is a report on some elementary problems in real $n$-space $R^n$ which admit solutions for a few exceptional values of $n$ only. The results are stated in an elementary way, without reference to algebraic topology; the proofs, however, rely heavily on the technique of homology and general cohomology theory, and relate these results to a deep theorem in the topology of polyhedra.




1 VECTOR PRODUCTS AND CONTINUOUS SOLUTIONS 


1.1 Vector Products 


We first recall some properties of the usual "vector product" $a \times b$ of vectors $a, b$ in $R^3$:


(i)$_0$ $a \times b$ is bilinear in $a$ and $b$.
(ii)$_0$ $a \times b$ is orthogonal to $a$ and $b$, with respect to a positive-definite inner product $\langle\,,\rangle$ in $R^3$.
(iii)$_0$ the length $|a \times b| = \langle a \times b, a \times b \rangle^{1/2}$ is given by $|a \times b|^2 = |a|^2 |b|^2 - \langle a, b \rangle^2$.


By a continuous vector product of $r$ vectors in $R^n$, $0 < r < n$, we understand a function $x(a_1, \ldots, a_r)$ of $r$ variable vectors $a_1, \ldots, a_r \in R^n$ with range $R^n$ having properties analogous to those above, with (i)$_0$ replaced by continuity; i.e., a function satisfying


(i) $x$ is continuous in $a_1, \ldots, a_r \in R^n$.
(ii) $\langle x(a_1, \ldots, a_r), a_j \rangle = 0$ for $j = 1, \ldots, r$.
(iii) $|x(a_1, \ldots, a_r)|^2 = $ Determinant of $\langle a_i, a_j \rangle_{i,j = 1, \ldots, r}$.


$\langle\,,\rangle$ denotes again a positive-definite inner product in $R^n$. The usual vector product $a \times b$ in $R^3$ is an example for $n = 3$, $r = 2$. An example for $n$ even, $r = 1$ can be given explicitly: Denote by $a^1, \ldots, a^n$ the components of the vector $a \in R^n$ in a fixed orthonormal basis of $R^n$, and let $x(a)$ be the vector with components $a^2, -a^1, a^4, -a^3, \ldots$. For arbitrary $n$ and $r = n - 1$, a (multilinear) vector product is well known in linear algebra: consider the $n \times (n - 1)$ matrix of the components of the vectors $a_1, \ldots, a_{n-1} \in R^n$, and take as components of $x(a_1, \ldots, a_{n-1})$ the $(n - 1) \times (n - 1)$ determinants of that matrix, in suitable order and with suitable signs.
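The two explicit examples just described can be checked by direct computation. The following is a minimal sketch in Python, not part of the original lecture: the helper names (`inner`, `det`, `gram_det`, `product_r1`, `product_codim1`) are illustrative assumptions, the standard inner product on $R^n$ is assumed, and the sign convention $(-1)^j$ is one possible choice of "suitable signs". Integer inputs keep the arithmetic exact.

```python
def inner(u, v):
    """Standard inner product on R^n (assumed for this sketch)."""
    return sum(x * y for x, y in zip(u, v))

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def gram_det(vectors):
    """Determinant of the Gram matrix <a_i, a_j>, the right-hand side of (iii)."""
    return det([[inner(u, v) for v in vectors] for u in vectors])

def product_r1(a):
    """Case n even, r = 1: components a^2, -a^1, a^4, -a^3, ..."""
    if len(a) % 2 != 0:
        raise ValueError("this construction needs n even")
    out = []
    for i in range(0, len(a), 2):
        out += [a[i + 1], -a[i]]
    return out

def product_codim1(vectors):
    """Case r = n - 1: signed (n-1) x (n-1) minors of the matrix with rows a_1..a_{n-1}."""
    n = len(vectors) + 1
    if any(len(v) != n for v in vectors):
        raise ValueError("need n - 1 vectors of length n")
    return [(-1) ** j * det([row[:j] + row[j + 1:] for row in vectors])
            for j in range(n)]
```

For $n = 3$ the second function reduces to the usual vector product; for instance `product_codim1([[1, 2, 3], [4, 5, 6]])` gives `[-3, 6, -3]`. In both cases property (ii) is the vanishing of a determinant with a repeated row, and property (iii) is the Cauchy-Binet identity.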

There exists a further (bilinear) example for $n = 7$, $r = 2$; it is related to the Cayley number multiplication in $R^8$ in the same manner as the vector product $a \times b$ in $R^3$ is related to the quaternion multiplication in $R^4$. We recall that the quaternion multiplication in $R^4$ can be described as follows: a vector $A \in R^4 = R \oplus R^3$ is written uniquely as $A = \alpha + a$, $\alpha \in R$, $a \in R^3$; then the quaternion product $A \cdot B$, $A = \alpha + a$, $B = \beta + b$, is given by


$$A \cdot B = \alpha\beta - \langle a, b \rangle + \alpha b + \beta a + a \times b \qquad (1)$$


It is immediately checked that this is indeed in agreement with the usual multiplication table of quaternions; the element $A = 1$ (that is, $\alpha = 1$, $a = 0$) is a two-sided unit. We note that the "norm product rule" valid for quaternions can be viewed as a consequence of (ii) and (iii): one extends the inner product of $R^3$ to $R^4$ by $\langle A, B \rangle = \alpha\beta + \langle a, b \rangle$; then the length of $A \cdot B$ is given by

$$\begin{aligned}
|A \cdot B|^2 &= \alpha^2\beta^2 - 2\alpha\beta\langle a, b \rangle + \langle a, b \rangle^2 + \alpha^2 |b|^2 + \beta^2 |a|^2 + 2\alpha\beta\langle a, b \rangle + |a \times b|^2 \\
&= \alpha^2\beta^2 + \alpha^2 |b|^2 + \beta^2 |a|^2 + \langle a, b \rangle^2 + |a|^2 |b|^2 - \langle a, b \rangle^2 \\
&= (\alpha^2 + |a|^2)(\beta^2 + |b|^2) = |A|^2 |B|^2
\end{aligned}$$


This equality


$$|A \cdot B|^2 = |A|^2 |B|^2$$


is called the norm product rule.
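Formula (1) and the norm product rule can be illustrated with a few lines of code. This is a sketch only, assuming the standard inner product and the usual vector product on $R^3$; the function names (`quat_mul`, `norm_sq`, and the helpers) are hypothetical and chosen for this example. A quaternion $A = \alpha + a$ is stored as the list `[alpha, a1, a2, a3]`.

```python
def inner(u, v):
    """Standard inner product (assumed)."""
    return sum(x * y for x, y in zip(u, v))

def cross(a, b):
    """Usual vector product in R^3."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def quat_mul(A, B):
    """Formula (1): A.B = alpha*beta - <a,b> + alpha*b + beta*a + a x b."""
    alpha, a = A[0], A[1:]
    beta, b = B[0], B[1:]
    scalar = alpha * beta - inner(a, b)
    c = cross(a, b)
    vector = [alpha * b[i] + beta * a[i] + c[i] for i in range(3)]
    return [scalar] + vector

def norm_sq(A):
    """Squared length with respect to <A,B> = alpha*beta + <a,b>."""
    return inner(A, A)
```

With `i = [0, 1, 0, 0]` and `j = [0, 0, 1, 0]`, `quat_mul(i, j)` returns `[0, 0, 0, 1]`, the quaternion $k$, and `norm_sq(quat_mul(A, B))` equals `norm_sq(A) * norm_sq(B)` for any integer inputs.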
One may, of course, define $a \times b$ in terms of the quaternion multiplication: we take, in (1), $\alpha = \beta = 0$ and obtain

$$a \cdot b = -\langle a, b \rangle + a \times b \qquad (2)$$


Using this as a definition of $a \times b$, the associative law for quaternions yields $a \cdot (b \cdot b) = (a \cdot b) \cdot b$, where the left-hand side is in $R^3$, since $b \cdot b = -|b|^2 \in R$; the right-hand side is equal to $(-\langle a, b \rangle + a \times b) \cdot b = -\langle a \times b, b \rangle - \langle a, b \rangle b + (a \times b) \times b$, hence $\langle a \times b, b \rangle = 0$, and similarly $\langle a \times b, a \rangle = 0$. The above computation of $|A \cdot B|$ shows that then the norm product rule implies (iii): $|a \times b|^2 = |a|^2 |b|^2 - \langle a, b \rangle^2$.

Now we turn to the Cayley multiplication in $R^8 = R \oplus R^7$; it has all the properties which were just used for quaternions: $A = 1$ is a two-sided unit; the norm product rule holds; although the associative law is not valid for the Cayley numbers, its weaker form $(A \cdot B) \cdot B = A \cdot (B \cdot B)$ used above is valid. Thus $a \cdot b = -\langle a, b \rangle + a \times b$ defines $a \times b$ in $R^7$ with all the required properties.

The Cayley numbers can further be used to define a (multilinear) vector product of 3 vectors in $R^8$, cf. [7] or [6].


1.2 Nonexistence Theorems 


In the case $r = 1$, the procedure given above to construct an example for $n$ even obviously fails if $n$ is odd, and it is clear that in this case a solution $x(a)$ which is linear in $a$ cannot exist [for $\langle x(a), a \rangle = 0$ implies that $\langle x(a), b \rangle + \langle a, x(b) \rangle = 0$ for all $a, b \in R^n$, and thus for $n$ odd $x(a)$ has determinant 0, which contradicts (iii)]. But even if $x(a)$ is allowed to be continuous instead of linear, no such function can exist; for it would constitute on the unit sphere $S^{n-1}$ given in $R^n$ by $|a| = 1$ a continuous tangent unit vector field, and by a classical theorem of algebraic topology such a field does not exist on a sphere of even dimension. This is just the simplest case of the nonexistence theorem which is one of the objectives of this lecture (Theorem 1 below); it states that except for the dimensions $n$ and $r$ which occur in the examples described in 1.1, vector products of $r$ vectors in $R^n$ cannot exist. Similar nonexistence theorems for related problems, allowing a few exceptional dimensions only, will be described in Section 2.


Theorem 1. Continuous vector products of $r$ vectors in $R^n$ exist only in the cases $n$ even, $r = 1$; $n$ arbitrary, $r = n - 1$; $n = 7$, $r = 2$; and $n = 8$, $r = 3$.

The proof of this theorem is discussed in 1.4 and in Section 3.


1.3 Continuous Solutions 


An equivalent version of the existence problem of vector products can be given in terms of continuous solutions of linear equations. We consider a system of $r$ linear equations

$$\sum_{k=1}^{n} a_{jk} x_k = 0, \qquad j = 1, \ldots, r$$

in $n$ unknowns $x_1, \ldots, x_n$ with real coefficients, and we ask for a solution $x_1, \ldots, x_n$ depending continuously on the coefficients $a_{jk}$, defined and nontrivial for all those real values of the $a_{11}, \ldots, a_{rn}$ for which the matrix has maximum rank $r$; that is, we ask for a system of continuous real functions $x_k = f_k(a_{11}, \ldots, a_{rn})$ defined and without common zeros for all admissible real values of the $a_{jk}$ and satisfying the identities

$$\sum_{k=1}^{n} a_{jk} f_k(a_{11}, \ldots, a_{rn}) = 0, \qquad j = 1, \ldots, r$$

If such a solution exists, it can always be renormalized in such a way that

$$\sum_{k=1}^{n} x_k^2 = \operatorname{Det} \langle a_i, a_j \rangle_{i,j = 1, \ldots, r}$$

where $\langle\,,\rangle$ is a positive-definite inner product in $R^n$ and where $a_j$ denotes the vector given in an orthonormal basis by the components $a_{j1}, \ldots, a_{jn}$, $j = 1, \ldots, r$. If the functions $f_k(a_{11}, \ldots, a_{rn})$ are then extended to all real values of the $a_{jk}$ by setting $f_k = 0$ if the rank of the matrix is $< r$, they remain continuous. The vector $x$ with components $x_k = f_k(a_1, \ldots, a_r)$ is then a vector product of $r$ vectors in $R^n$.


Theorem 1'. A system of $r$ real linear equations with $n$ unknowns admits continuous solutions only for the values of $n$ and $r$ listed in Theorem 1.


In many applications solutions of linear systems should depend continuously upon the coefficients, or even in a more restricted way (differentiably, etc.). According to Theorem 1' this is possible (unless one is in the exceptional cases $n$, $r$) only if the domain of the coefficients $a_{jk}$ is restricted in some way; or in different terms, if the solution is allowed to become trivial for some values of the coefficients. This statement can, of course, also be given a positive version: Let the pair of integers $n, r$ with $0 < r < n$ be different from those in Theorem 1; if $n$ continuous real functions $x_k = f_k(a_{11}, \ldots, a_{rn})$ defined for all real $r \times n$ matrices $(a_{jk})$ of rank $r$ satisfy the identities

$$\sum_{k=1}^{n} a_{jk} f_k(a_{11}, \ldots, a_{rn}) = 0, \qquad j = 1, \ldots, r$$

then they must have a common zero.


1.4 Reduction 


Lemma. If there exists a continuous vector product of $r$ vectors in $R^n$, then there also exists one of $r - 1$ vectors in $R^{n-1}$.


PROOF. Let $x(a_1, \ldots, a_r)$ be the vector product in $R^n$; consider $R^{n-1}$ as a subspace of $R^n$ and let $R^1 = Re$ be its orthogonal complement, with respect to $\langle\,,\rangle$. If we write $x(a_1, \ldots, a_{r-1}, e) = x' + x''$, $x' \in R^{n-1}$, $x'' \in Re$, one easily checks that $x'$ as a function of $a_1, \ldots, a_{r-1}$ is a vector product in $R^{n-1}$ (relative to the induced inner product).
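The reduction in the lemma can be tried out on the smallest case, $n = 3$, $r = 2$. The sketch below is illustrative only: `reduce_product` is a hypothetical helper name, $R^{n-1}$ is assumed to be embedded as the first $n - 1$ coordinates, and $e$ is taken to be the last standard basis vector.

```python
def cross(a, b):
    """Usual vector product in R^3 (n = 3, r = 2)."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def reduce_product(x, n):
    """From a product x of r vectors in R^n, build one of r - 1 vectors in R^(n-1).

    R^(n-1) is embedded as the first n - 1 coordinates and e is the last
    basis vector; the result is the component x' of x(a_1, ..., a_(r-1), e).
    """
    e = [0] * (n - 1) + [1]

    def x_reduced(*vectors):
        embedded = [list(v) + [0] for v in vectors]
        full = x(*embedded, e)
        return full[:n - 1]  # drop the component along e

    return x_reduced

# Starting from the vector product in R^3 one recovers the r = 1 example in R^2.
rotate = reduce_product(cross, 3)
```

Here `rotate([3, 4])` returns `[4, -3]`, which is the $n = 2$, $r = 1$ example with components $a^2, -a^1$ from 1.1.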

To establish nonexistence (Theorem 1), it is therefore sufficient to do so 
for the lowest values of r. 


The case $r = 1$ being settled, we have to consider $r = 2$. This case essentially reduces to the existence problem of continuous multiplications in $R^{n+1}$: If there exists a continuous vector product of two vectors in $R^n$, then there exists in $R^{n+1}$ a continuous multiplication $A \cdot B$ with two-sided unit and with norm product rule.


Indeed, if $a \times b$ denotes the vector product in $R^n$, and if $A \in R^{n+1} = R \oplus R^n$ is written as $A = \alpha + a$, $\alpha \in R$, $a \in R^n$, then formula (1) above yields a continuous multiplication in $R^{n+1}$ with $A = 1$ being the unit, and where the norm product rule follows from (ii) and (iii) exactly as shown above for the quaternions.

Multiplications in $R^{n+1}$ of the required type are discussed in Section 2; the main result (Theorem A) states that they exist (if and) only if $n + 1 = 1, 2, 4$, or $8$. Therefore a vector product of two vectors in $R^n$ can only exist for $n = 3$ or $7$; according to the reduction lemma a vector product of $r$ vectors in $R^n$ can only exist if $n - r = 1$ or $n - r = 5$. For $n = 7$, $r = 2$ and for $n = 8$, $r = 3$ (bilinear) vector products exist. To complete the proof of Theorem 1, one has therefore just to show that for $n = 9$, $r = 4$ no vector product can exist; this can be done by a simple cohomology ring argument (cf. [6]) in the Stiefel manifold $V_{9,5}$. The main burden of the proof, however, is transferred to the proof of Theorem A, sketched in Section 3.


2 CONTINUOUS MULTIPLICATIONS IN $R^{n+1}$


The length $|A|$ of $A \in R^{n+1}$ and the norm product rule always refer to a positive-definite inner product in $R^{n+1}$.


2.1 Adams’ Theorem 


Theorem A. A continuous multiplication in $R^{n+1}$ with two-sided unit and with norm product rule exists only for $n + 1 = 1, 2, 4, 8$.

In the four exceptional cases multiplications of the required type do 
exist: the product of reals, of complex numbers, of quaternions, and of 
Cayley numbers have a unit and fulfill the norm product rule (for a suitable 
inner product). These multiplications are even bilinear; Hurwitz’ classical 
theorem, the analog of Theorem A for the bilinear case, states that this can 
only occur in the dimensions n + 1 = 1, 2, 4, or 8. Thus Theorem A simply 
says that the possible dimensions remain the same if bilinearity is replaced by 
continuity. 

Theorem A was first proved by Adams [1]. A very simple proof has 
been given by Adams-Atiyah [2] (cf. also [4] for a similar version which we 
use in Section 3 to give a sketch of the proof). The simplified proofs are 
based on the technique of K-theory and on some fundamental properties of 
the Chern character; for an introduction to these techniques see for example 
[4] and the reference given there. 

As shown in 1.4, our Theorem 1 on continuous vector products (continuous solutions of linear equations) is essentially a corollary of Theorem A.
The present section is devoted to further corollaries; they emphasize the great 
importance of Adams’ theorem. 


2.2 Division Algebras


$R^{n+1}$ is turned into a division algebra if a bilinear multiplication $A \cdot B$ is given which has no zero divisors. Note that the norm product rule is not required (but if it holds, there are, of course, no zero divisors). Without loss of generality, we can assume a bilinear multiplication to have a two-sided unit of length 1. If $A \cdot B$ is replaced by $A \cdot B \, |A||B| / |A \cdot B|$ for $A, B \neq 0$, the new multiplication will satisfy the norm product rule. It will, however, have lost bilinearity, but remain continuous. Thus Theorem A implies


Theorem 2. $R^{n+1}$ admits a division algebra structure (if and) only if $n + 1 = 1, 2, 4$, or $8$.




2.3 Parallelizability of Spheres


The unit sphere $S^n \subset R^{n+1}$ is said to be parallelizable if there exist $n$ tangent vector fields on $S^n$ which are linearly independent at every point $X \in S^n$ ($X$ = unit vector $\in R^{n+1}$). If this is the case, one can replace these fields by orthonormalized ones. The components of $X$ and those of the $n$ vector fields, with respect to an orthonormal basis of $R^{n+1}$, taken as rows form an orthogonal $(n + 1) \times (n + 1)$ matrix $C = (c_{jk}(X))$, the elements $c_{jk}(X)$ being continuous functions of $X$; we take the components of $X$ as first row, $c_{1k} = x_k$. Let $y_k$, $k = 1, \ldots, n + 1$, be the components of $Y \in R^{n+1}$ and define $Z \in R^{n+1}$ by its components $z_k$,

$$z_k = \sum_{j=1}^{n+1} c_{jk}(X)\, y_j, \qquad k = 1, \ldots, n + 1$$

Writing $X \cdot Y$ for $Z$, this is a continuous multiplication in $R^{n+1}$ (for $|X| = 1$, $Y$ arbitrary) with $|X \cdot Y| = |Y|$. We extend it to all $X, Y \in R^{n+1}$ by $X \cdot Y = |X|((X/|X|) \cdot Y)$ for $X \neq 0$ and $0 \cdot Y = 0$. Then the norm product rule holds, and the vector $E = (1, 0, \ldots, 0)$ is a right unit. By a suitable choice of the basis one may assume $C(E)$ to be the unit matrix, that is, that $E$ is also a left unit. Thus Theorem A implies


Theorem 3. $S^n$ is parallelizable only for $n = 1, 3$, or $7$.


Remarks. (a) In these three cases, the classical multiplications in $R^2$, $R^4$, and $R^8$ actually yield a parallelization of $S^n$, by vector fields which are linear in $X$.
(b) The multiplication $Z = X \cdot Y$ above is linear in $Y$ and continuous in $X$. Thus Theorem A has not been used in its full generality. Other proofs of Theorem 3, not depending on Theorem A, have been given earlier (see, for example, [5]).
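Remark (a) can be made concrete for $n = 3$. The following sketch, which is an illustration and not taken from the lecture, builds the matrix $C(X)$ for $S^3$ from the quaternion multiplication by taking as rows $X$, $X \cdot i$, $X \cdot j$, $X \cdot k$; this particular choice of rows and all function names are assumptions. Exact rational arithmetic is used so that orthonormality can be tested without rounding.

```python
from fractions import Fraction

def inner(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def quat_mul(A, B):
    """Quaternion product via formula (1) of 1.1."""
    alpha, a = A[0], A[1:]
    beta, b = B[0], B[1:]
    c = cross(a, b)
    return [alpha * beta - inner(a, b)] + [
        alpha * b[i] + beta * a[i] + c[i] for i in range(3)]

BASIS = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def frame(X):
    """Rows of C(X): X itself, then the tangent vectors X.i, X.j, X.k."""
    return [quat_mul(X, E) for E in BASIS]

def multiply(X, Y):
    """z_k = sum_j c_jk(X) y_j, the multiplication defined from the frame."""
    C = frame(X)
    return [sum(C[j][k] * Y[j] for j in range(4)) for k in range(4)]

# A rational point of S^3: |X|^2 = 4 * (1/2)^2 = 1.
X = [Fraction(1, 2)] * 4
C = frame(X)
```

For this choice the multiplication `multiply(X, Y)` recovers the quaternion product `quat_mul(X, Y)`, and each row of `frame(X)` is linear in `X`, in line with remark (a).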


2.4 Almost-Complex Structures on Spheres 


Let $M$ be a complex-analytic manifold; its tangent bundle admits a complex vector bundle structure: Multiplication by $\sqrt{-1}$ of complex vector components in an admissible local complex coordinate system defines, at each point $p \in M$, a linear transformation $J(p)$ of the real tangent space $M_p$ at $p$, with $J(p)^2 = -$identity, and $J(p)$ does not depend upon the complex coordinate system used. In fact the field $J(p)$ characterizes the complex-analytic structure given on $M$. If on a real differentiable manifold $M$ such a field $J(p)$ of linear transformations with $J(p)^2 = -$identity is given, independently of any complex-analytic structure, $M$ is called an almost-complex manifold (and $J(p)$ an almost-complex structure on $M$).


Theorem 4. $S^n$ admits an almost-complex structure only for $n = 2$ or $6$.


PROOF. If an almost-complex structure $J$ is given on $S^{n-1} \subset R^n$, a continuous vector product $a \times b$ can be defined in $R^n$: for $a \in S^{n-1}$ and $b$ a unit tangent vector at $a$ (that is, for $a$ and $b \in R^n$ with $|a| = |b| = 1$, $\langle a, b \rangle = 0$), we consider the oriented tangent two-plane determined by $b$ and $J(a)b$ and choose $a \times b$ to be the unit vector orthogonal to $b$ in that plane and corresponding to the orientation; the product $a \times b$ is then easily extended to all $a$ and $b \in R^n$ so as to fulfill the properties of a vector product. Thus $n + 1$ must be $= 1, 2, 4$, or $8$, that is, $n = 1, 3$, or $7$, and hence the dimension of the sphere is 2 or 6. In these two dimensions, almost-complex structures actually exist.


2.5 Complex Vector Products 


Theorem 5. A continuous vector product of two vectors in $C^n$, with respect to a positive-definite Hermitian inner product in $C^n$, only exists for $n = 3$.

The definition of the complex vector product is of course entirely analogous to the real case; the proof of Theorem 5 uses a construction similar to (1) to yield a continuous multiplication in $C^{n+1}$ with two-sided unit and with norm product rule (with respect to the Hermitian length of vectors). This can be interpreted as a multiplication in $R^{2n+2}$ with two-sided unit and norm product rule. Hence, $2n + 2 = 2, 4$, or $8$, that is, $n = 3$.


3 PROOF OF THEOREM A AND GENERAL REMARKS 


In this section we make free use, without reference, of concepts and results in algebraic topology of polyhedra (some of the material is covered in the course of R. Bott at these Rencontres); and further of K-theory and some of the fundamental properties of the Chern character of complex vector bundles. (For an introduction to these topics see, for example, [4] and references given there.)

In the proof of Theorem A the trivial case n = 0 will always be excluded. 


3.1 Pseudoprojective Planes 


If in $R^{n+1}$ a continuous multiplication $A \cdot B$ with unit $E$ and with norm product rule is given, we consider the map $f: S^n \times S^n \to S^n$ defined by $f(A, B) = A \cdot B$, $|A| = |B| = 1$ (that is, $(A, B)$ is considered to be a point of $S^n \times S^n$). The map $f$ restricted to $S^n \times E$ is the identity of $S^n$ (that is, a map of degree 1), and so is $f$ restricted to $E \times S^n$. By an elementary construction due to Hopf one obtains from $f$ a map $F: S^{2n+1} \to S^{n+1}$ having "Hopf invariant one." $F$ is used to attach to $S^{n+1}$ a $(2n + 2)$-cell $e^{2n+2}$ by mapping the boundary of $e^{2n+2}$ to $S^{n+1}$ by $F$ (cf. Bott's course). The polyhedron (finite CW-complex) $X = S^{n+1} \cup_F e^{2n+2}$ may be called the pseudoprojective plane over $R^{n+1}$ relative to the given multiplication; for $n = 1$ and the $C$-multiplication in $R^2$, $X$ is the complex projective plane $CP_2$, and the above construction is a simple generalization of the usual procedure which leads from the affine plane $C^2$ to $CP_2$.
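The text does not spell out Hopf's construction. One common form of it, assumed here for illustration, sends a point $(x, y)$ of $S^{2n+1} \subset R^{n+1} \oplus R^{n+1}$ to $(|x|^2 - |y|^2,\ 2\, x \cdot y) \in R \oplus R^{n+1}$; by the norm product rule the image has length $(|x|^2 + |y|^2)^2 = 1$. The sketch below uses the complex multiplication in $R^2$ (the case $n = 1$); the names `complex_mul` and `hopf_construction` are hypothetical.

```python
from fractions import Fraction

def norm_sq(v):
    return sum(c * c for c in v)

def complex_mul(x, y):
    """Multiplication of complex numbers written as pairs in R^2."""
    return (x[0] * y[0] - x[1] * y[1], x[0] * y[1] + x[1] * y[0])

def hopf_construction(mul, x, y):
    """Assumed form of the Hopf map: (x, y) -> (|x|^2 - |y|^2, 2 x.y).

    If mul satisfies the norm product rule and |x|^2 + |y|^2 = 1,
    the image lies on the unit sphere one dimension up.
    """
    return (norm_sq(x) - norm_sq(y),) + tuple(2 * c for c in mul(x, y))

# A rational point of S^3: (3/5)^2 + (4/5)^2 = 1.
x = (Fraction(3, 5), 0)
y = (0, Fraction(4, 5))
p = hopf_construction(complex_mul, x, y)
```

Here `p` is the point $(-7/25,\ 0,\ 24/25)$ of $S^2$. Passing any other multiplication with norm product rule as `mul` gives the corresponding map $S^{2n+1} \to S^{n+1}$ in the same way.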

The cohomology algebra with integer coefficients $H(X)$ of this polyhedron $X$ is easily computed from the explicit cell structure: it is the polynomial algebra $Z[a]$ over the integers $Z$ in a certain element $a \in H^{n+1}(X)$, modulo the ideal $(a^3)$ generated by $a^3$,


$$H(X) = Z[a]/(a^3), \qquad a \in H^{n+1}(X)$$


the generator $a$ corresponds to the fundamental cohomology class of $S^{n+1} \subset X$. Theorem A is then an immediate consequence of the following purely cohomological theorem (which, in its turn, admits further interesting generalizations, see [2]).


Theorem A'. A finite polyhedron $X$ with $H(X) = Z[a]/(a^3)$, $a \in H^{n+1}(X)$, exists only for $n + 1 = 2, 4$, or $8$.

From the anticommutativity of the graded algebra $H(X)$ it is clear that $n + 1$ must be even; we put $n + 1 = 2l$.


3.2 Properties of K(X) 


We now consider the group $K(X)$ for an arbitrary finite polyhedron $X$; it is the Grothendieck group of complex vector bundles over $X$, or equivalently, the group of homotopy classes of maps of $X$ into $\Omega U$ (the "loop space of the infinite unitary group," cf. Bott's lectures). We will use a few standard properties of the contravariant functor $K$, listed below.

The tensor product of complex vector bundles defines in $K(X)$ a commutative ring structure with unit. There exists a natural transformation ch, called the Chern character of complex vector bundles, of $K(X)$ into $H(X; Q)$, the cohomology algebra of $X$ over the rationals $Q$; it maps $K(X)$ into even-dimensional cohomology only:


$$\operatorname{ch} z = a_0 + a_2 + \cdots + a_{2i} + \cdots, \qquad a_{2i} \in H^{2i}(X; Q)$$


for $z \in K(X)$, and it is a ring homomorphism. Furthermore there exist natural transformations $\psi_k: K(X) \to K(X)$, the Adams operations, one for each $k = 1, 2, 3, \ldots$, which are ring homomorphisms and related to the Chern character by


$$\operatorname{ch} \psi_k z = a_0 + k a_2 + \cdots + k^i a_{2i} + \cdots$$


where $\operatorname{ch} z = a_0 + a_2 + \cdots + a_{2i} + \cdots$ as above with $a_{2i} \in H^{2i}(X; Q)$. The $\psi_k$ are defined by means of exterior powers $\lambda_k$ of complex vector bundles; in particular, $\psi_2 z = z^2 - 2\lambda_2 z$.

If the polyhedron $X$ is without torsion, that is, if $H(X)$ over the integers has no torsion, $\operatorname{ch}: K(X) \to H(X; Q) = H(X) \otimes Q$ has kernel 0. Moreover, in $\operatorname{ch} z = a_0 + a_2 + \cdots + a_{2i} + \cdots$, the first nonvanishing term, say, $a_{2i}$, is integral, that is, it lies in the subgroup $H^{2i}(X) \subset H^{2i}(X; Q)$, where the integral cohomology is identified with its natural image in $H^{2i}(X; Q) = H^{2i}(X) \otimes Q$; any given element of $H^{2i}(X)$, $i = 0, 1, 2, \ldots$, appears in this way as the first nonvanishing component of $\operatorname{ch} z$ for some $z \in K(X)$ (these last statements constitute the Atiyah-Hirzebruch integrality theorem in its simplest version).


3.3 Proof of Theorem A’ 


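In the case of Theorem A' we consider a polyhedron $X$ with $H(X) = Z[a]/(a^3)$, $H(X; Q) = Q[a]/(a^3)$, $a \in H^{2l}(X)$, $2l = n + 1$. There exists an element $z \in K(X)$ with


$$\operatorname{ch} z = a + \lambda a^2, \qquad \lambda \in Q$$

For this $z$ the properties listed in 3.2 yield the following relations:

$$\begin{aligned}
\operatorname{ch} z^2 &= a^2 \\
\operatorname{ch} \psi_k z &= k^l a + \lambda k^{2l} a^2 \\
\operatorname{ch} (\psi_k z - k^l z) &= \lambda k^l (k^l - 1) a^2 = \lambda k^l (k^l - 1) \operatorname{ch} z^2
\end{aligned}$$


with $\lambda k^l (k^l - 1) \in Z$. Since the kernel of ch is 0, $\psi_k z = k^l z + \lambda k^l (k^l - 1) z^2$. In particular, for $k = 2$, we have


$$\psi_2 z = 2^l z + \lambda 2^l (2^l - 1) z^2 = z^2 - 2\lambda_2 z$$

therefore $\lambda 2^l (2^l - 1)$ is odd, and thus $\lambda \in Q$ must be of the form


$$\lambda = \frac{\text{odd}}{\text{odd} \cdot 2^l}$$


For an arbitrary odd integer $k > 1$, $(\text{odd}/(\text{odd} \cdot 2^l)) \cdot k^l (k^l - 1)$ is an integer, which means that $2^l$ divides $k^l - 1$.

Thus from the existence of a polyhedron $X$ satisfying the assumption of Theorem A', $n + 1 = 2l$, we conclude that $2^l$ divides $k^l - 1$ for all odd integers $k > 1$.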

From this purely numerical result one easily draws very precise conclusions. First we note that for $l = 1$ the property holds, and we therefore assume $l > 1$. Taking $k = 2^{l-1} + 1$, the fact that $2^l$ divides $(2^{l-1} + 1)^l - 1 = 2^l \cdot \text{integer} + l\, 2^{l-1}$ implies that $l$ is even, $l = 2m$. Taking $k = 2^m + 1$, the fact that $2^{2m}$ divides $(2^m + 1)^{2m} - 1 = 2^{2m} \cdot \text{integer} + 2m \cdot 2^m$ implies that $2^{2m}$ divides $m\, 2^{m+1}$, that is, $2^{m-1}$ divides $m$. This is so for $m = 1$ and $2$ only, whence $l = 2$ or $4$. Together with $l = 1$, we thus have $n + 1 = 2l = 2, 4$, or $8$, which proves Theorem A'.
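The numerical condition at the heart of the proof is easy to test by machine. The following sketch (the function name `divisibility_holds` and the search bound 16 are choices made for this example) checks for which $l$ the number $2^l$ divides $k^l - 1$ for every odd $k > 1$. Since $k^l \bmod 2^l$ depends only on $k \bmod 2^l$, it is enough to test the odd $k$ below $2^l$.

```python
def divisibility_holds(l):
    """True if 2**l divides k**l - 1 for every odd integer k > 1.

    Only the odd residues 3, 5, ..., 2**l - 1 need to be tested, because
    k**l mod 2**l depends only on k mod 2**l (and k = 1 is trivial).
    """
    modulus = 2 ** l
    return all(pow(k, l, modulus) == 1 for k in range(3, modulus, 2))

# Values of l up to 16 for which the condition holds.
admissible = [l for l in range(1, 17) if divisibility_holds(l)]
```

In this range `admissible` comes out as `[1, 2, 4]`, matching the conclusion $n + 1 = 2l = 2, 4$, or $8$. For instance $l = 3$ already fails at $k = 3$, since $3^3 - 1 = 26$ is not divisible by 8.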


3.4 Remarks 


We finally remark that Theorem A can be given the following interpretation: With regard to the existence of multiplications in $R^{n+1}$ with unit and with norm product rule, the dimensions which are exceptional (in the sense that such a multiplication exists) in the continuous case are also exceptional in the bilinear case (the latter is solved by Hurwitz' classical theorem, that is, by elementary algebra). The proof reproduced above, although it is very simple, does not explain directly why this is so. The situation is similar in various other problems of that type; for example, in the vector product problem (continuous solutions of linear equations) from which we started; or in the problem of finding the maximum number of independent tangent vector fields on a sphere, also solved by Adams [8], and in the application thereof to linear families of nonsingular matrices [9]. It would be most interesting to find an a priori reason why the existence of "continuous solutions" in all these cases implies the existence of "linear solutions" (or, in any case, simple algebraic solutions). If such a reduction principle from the continuous to the algebraic case could be established, it might clarify the exceptional character of certain low dimensions $n$ of the real $n$-spaces.


REFERENCES 


1. J. F. Adams, Ann. of Math. 72 (1960), 20-104. 

2. J. F. Adams and M. F. Atiyah, Quart. J. Math. 17 (1966), 31-38. 

3. B. Eckmann, Comm. Math. Helv. 15 (1942/43), 318-339. 

4. B. Eckmann, Cours CIME (cohomologie et classes caractéristiques) (1966). 
5. M. Kervaire, Proc. Nat. Acad. Sci. USA (1958), 280-283. 

6. G. W. Whitehead, Comm. Math. Helv. 37 (1962), 239-240. 

7. P. Zvengrowski, Comm. Math. Helv. 40 (1965/66), 149-152. 

8. J. F. Adams, Ann. of Math. 75 (1962), 603-632. 

9. J. F. Adams, P. Lax, and R. S. Phillips, Proc. A.M.S., 16 (1965), 318-322. 


XX 


Differentiable 
Dynamical Systems 


STEPHEN SMALE 


An article with the same title appears in the Bulletin of the 
American Mathematical Society 73, 747 (1967). 




XXII 


Characterization of 
Stable Mappings 


JOHN N. MATHER 


We let $N$ and $P$ be $C^\infty$ manifolds without boundary. We suppose $N$ and $P$ are finite dimensional, have countable bases for their topologies, and are Hausdorff. We set $n = \dim N$, $p = \dim P$.

For each nonnegative integer $k$, we let $J^k(N, P)$ denote the bundle of $k$-jets of mappings of $N$ into $P$. This is a bundle over $N \times P$ and also over $N$ and over $P$. For any $C^\infty$ mapping $f: N \to P$, we let $j^k(f): N \to J^k(N, P)$ denote the $k$-jet extension of $f$. We let $W_k$ denote the Whitney $C^k$ topology on the set $C^\infty(N, P)$ of $C^\infty$ mappings of $N$ into $P$. This topology has for basis the family of sets $M(U)$, for $U$ open in $J^k(N, P)$, where we set $M(U) = \{f \in C^\infty(N, P) : j^k(f)[N] \subset U\}$.

We let $W$ denote the topology on $C^\infty(N, P)$ generated by $\bigcup_k W_k$.


Definition 1. $f \in C^\infty(N, P)$ is stable if there exists a neighborhood $U$ of $f$ in $C^\infty(N, P)$ (with respect to the topology $W$) such that for every $g \in U$ there exists $h \in \operatorname{Diff} N$ (the set of $C^\infty$ diffeomorphisms of $N$ into itself) and $h' \in \operatorname{Diff} P$ such that $g = h' \circ f \circ h$.

In this paper, we will state a theorem characterizing the stable proper mappings in terms of local data. We will outline part of the proof. Details will be given elsewhere.

For any manifold $U$ and any $u \in U$ we let $C^\infty(U)_u$ denote the ring of germs at $u$ of $C^\infty$ functions on $U$. We let $\mathfrak{M}_u$ denote the maximal ideal in $C^\infty(U)_u$.

For each $z \in J^k(N, P)$, we set $Q(z) = C^\infty(N)_x / (f^*[\mathfrak{M}_y] \cdot C^\infty(N)_x + \mathfrak{M}_x^{k+1})$, where $x$ is the source of $z$, $y$ is the target of $z$, and $f$ is any representative of $z$. [In other words $\langle x, y \rangle$ is the image of $z$ under the bundle mapping of $J^k(N, P)$ on $N \times P$, and $z = j^k(f)(x)$.] $Q(z)$ is an $R$ algebra. The term $\mathfrak{M}_x^{k+1}$ in the denominator guarantees that $Q(z)$ depends only on $z$, not on the choice of $f$.




We will say $z \in J^k(N, P)$ and $z' \in J^k(N', P')$ (where $\dim N' = \dim N$, $\dim P' = \dim P$) are contact equivalent if the $R$ algebras $Q(z)$ and $Q(z')$ are isomorphic. By a contact class in $J^k(N, P)$, we mean an equivalence class under the relation of contact equivalence.


Lemma 1. If $U$ is a contact class in $J^k(N, P)$, then $U$ is a submanifold.


PROOF. For any $x \in N$ and any $y \in P$ we let $J^k_{x,y}$ denote the fiber of $J^k = J^k(N, P)$ over $\langle x, y \rangle$. We let $U_{x,y}$ denote the fiber of $U$ over $\langle x, y \rangle$. It is enough to show that $U_{x,y}$ is a submanifold of $J^k_{x,y}$.

Let $K_{x,y}$ denote the group (under composition) of invertible $C^\infty$ germs of mappings $H: (N \times P, x \times y) \to (N \times P, x \times y)$ such that there exists a $C^\infty$ germ of a mapping $h: (N, x) \to (N, x)$ making Diag. 1 commutative.

$$\begin{array}{ccccc}
(N, x) & \longrightarrow & (N \times P, x \times y) & \longrightarrow & (N, x) \\
\downarrow h & & \downarrow H & & \downarrow h \\
(N, x) & \longrightarrow & (N \times P, x \times y) & \longrightarrow & (N, x)
\end{array}$$

DIAGRAM 1


Let $K^k_{x,y}$ denote the quotient of $K_{x,y}$ by the subgroup of $K_{x,y}$ consisting of those $H$ which have the same $k$-jet as the identity.

There is a natural action of $K^k_{x,y}$ on $J^k_{x,y}$ which may be described as follows: Let $\eta \in K^k_{x,y}$ and $z \in J^k_{x,y}$. Let $H: (N \times P, x \times y) \to (N \times P, x \times y)$ be a representative of $\eta$ and $f: (N, x) \to (P, y)$ be a representative of $z$. Let $h: (N, x) \to (N, x)$ be such that Diag. 1 commutes. Let $g: (N, x) \to (P, y)$ be the unique germ of a mapping such that


$$\langle 1_N, g \rangle = H \circ \langle 1_N, f \rangle \circ h^{-1} : (N, x) \to (N \times P, x \times y)$$


Then $\eta \cdot z$ is defined as the $k$-jet of $g$. It is easily seen that $\eta \cdot z$ is independent of the choices made and that this defines a left action of $K^k_{x,y}$ on $J^k_{x,y}$.

The space $J^k_{x,y}$ has, of course, a natural structure of a $C^\infty$ manifold. Likewise $K^k_{x,y}$ has a natural structure of a Lie group. The action of $K^k_{x,y}$ on $J^k_{x,y}$ is $C^\infty$. In fact, one may say more. We may regard $J^k_{x,y}$ (in a natural way) as the set of $R$ rational points of a nonsingular variety defined over the field $Q$ of rational numbers. Likewise, we may regard $K^k_{x,y}$ as the set of $R$ rational points of an algebraic group defined over $Q$. The action of $K^k_{x,y}$ on $J^k_{x,y}$ is $Q$ rational.

It follows that any orbit of the action of $K^k_{x,y}$ on $J^k_{x,y}$ is a $C^\infty$ submanifold. It is easily seen that two points $z, z' \in J^k_{x,y}$ are in the same orbit if and only if they are contact equivalent. In particular, $U_{x,y}$ is an orbit of this action. This completes the proof.




An element $z \in J^k(N, P)$ will be said to be submersive if any representative $f$ of $z$ has the property that $Tf_x : TN_x \to TP_{f(x)}$ is onto, where $x$ denotes the source of $z$. Clearly there are no submersive jets in $J^k(N, P)$ if $\dim N < \dim P$, and the submersive jets constitute a contact class if $\dim N \geq \dim P$. This will be called the submersive contact class; all other contact classes will be said to be nonsubmersive.


Lemma 2. If $U$ is a nonsubmersive contact class of $J^k(N, P)$ and $j^k(f)$ is transversal to $U$, then

$$f \mid j^k(f)^{-1}U : j^k(f)^{-1}U \to P$$

is an immersion.


We will not give the proof here. 

For any manifold $U$, we let $TU$ denote the tangent bundle of $U$. For any $C^\infty$ mapping $g: U \to V$, we let $Tg: TU \to TV$ denote the induced mapping. For any bundle $E$, we let $\Gamma(E)$ denote the set of $C^\infty$ sections of $E$. We define $tf: \Gamma(TN) \to \Gamma(f^*TP)$ by $tf(\xi) = Tf \circ \xi$ and $wf: \Gamma(TP) \to \Gamma(f^*TP)$ by $wf(\eta) = \eta \circ f$. (Compare Diag. 2.)


DIAGRAM 2 (relating $TN$, $TP$, $N$, and $P$ through $Tf$, $f$, the sections $\xi$ and $\eta$, and the bundle projections)

Definition 2. $f$ is infinitesimally stable if $tf + wf$ is onto.


Proposition A. Let $f: N \to P$ be a proper $C^\infty$ mapping. Then (i), (ii), (iii) below are equivalent, if $k > \dim P$.

(i) The following two conditions hold:

(a) $j^k(f)$ is transversal to every contact class in $J^k(N, P)$.
(b) If $U_1, \ldots, U_a$ are nonsubmersive contact classes of $J^k(N, P)$, then the family of immersions

$$f \mid j^k(f)^{-1}U_i : j^k(f)^{-1}U_i \to P, \qquad i = 1, \ldots, a$$

has only normal crossings.

(ii) $f$ is infinitesimally stable.
(iii) $f$ is stable.
In the above theorem "normal crossings" means the following: Given a family of $C^\infty$ mappings $g_i: V_i \to Q$, $i = 1, \ldots, a$, of $C^\infty$ manifolds, and points $x_1, \ldots, x_b$, where each $x_j$ is contained in some $V_{i(j)}$, we say the $b$-tuple $x = \langle x_1, \ldots, x_b \rangle$ is a crossing if $g_{i(1)}(x_1) = \cdots = g_{i(b)}(x_b)$. Let $y$ denote this common value. We say $x$ is a normal crossing if


$$\operatorname{cod} \bigcap_{j=1}^{b} Tg_{i(j)}\big[T(V_{i(j)})_{x_j}\big] = \sum_{j=1}^{b} \operatorname{cod} Tg_{i(j)}\big[T(V_{i(j)})_{x_j}\big]$$


where cod means the codimension in $TQ_y$.

The proof goes (i) $\Rightarrow$ (ii) $\Rightarrow$ (iii) $\Rightarrow$ (i). We will now outline the proof that (i) $\Rightarrow$ (ii).


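There is a natural identification


$$T(J_x^k)_z = \frac{\Gamma(f^*TP)}{\mathfrak{M}_x^{k+1}\,\Gamma(f^*TP)} \qquad (1)$$


for any $f \in C^\infty(N, P)$, and any $x \in N$, where $z = j^k(f)(x)$. $T(J_x^k)_z$ denotes the tangent space to $J_x^k$ at $z$. We consider $\Gamma(f^*TP)$ as a $C^\infty(N)$ module in the obvious way. [$C^\infty(N)$ denotes the ring of $C^\infty$ functions on $N$, with values in $R$.] We let $\mathfrak{M}_x$ denote the ideal in $C^\infty(N)$ consisting of those functions which vanish at $x$. Note that $\mathfrak{M}_x^{k+1}$ (which, by definition, is the set of all sums of products $f_1 \cdots f_{k+1}$, where each $f_i \in \mathfrak{M}_x$) is the set of $f \in C^\infty(N)$ which vanish at $x$ together with their derivatives of order $\leq k$ (with respect to some coordinate system).

The identification (1) may be described as follows: Let


$$\bar{\zeta} \in \Gamma(f^*TP) / \mathfrak{M}_x^{k+1}\,\Gamma(f^*TP)$$


Let $\zeta$ be a member of $\Gamma(f^*TP)$ which projects onto $\bar{\zeta}$. Let $f_t$ be a family of $C^\infty$ mappings of $N$ into $P$, depending differentiably on all variables, and defined for $t$ in a small interval $(-\varepsilon, \varepsilon)$ about 0, such that $\partial f_t / \partial t |_{t=0} = \zeta$. For each $t \in (-\varepsilon, \varepsilon)$, let $z_t = j^k(f_t)(x)$. Then $\partial z_t / \partial t |_{t=0}$ is in $T(J_x^k)_z$. We identify $\bar{\zeta}$ with $\partial z_t / \partial t |_{t=0}$. It is easily seen that this defines an identification of $R$ vector spaces.

In terms of the identification (1), we obtain the following equalities: Trivially,


$$T(J^k_{x,y})_z = \frac{\mathfrak{M}_x\,\Gamma(f^*TP)}{\mathfrak{M}_x^{k+1}\,\Gamma(f^*TP)} \qquad (2)$$


where $y$ is the target of $z$. Let $U$ denote the contact class containing $z$. Let $U_x$, resp. $U_{x,y}$, denote the fiber of $U$ over $x$, resp. $\langle x, y \rangle$. Then


$$T(U_{x,y})_z = \frac{tf[\mathfrak{M}_x\,\Gamma(TN)] + (f^*[\mathfrak{M}_y] + \mathfrak{M}_x^{k+1})\,\Gamma(f^*TP)}{\mathfrak{M}_x^{k+1}\,\Gamma(f^*TP)} \qquad (3)$$

and

$$T(U_x)_z = \frac{tf[\mathfrak{M}_x\,\Gamma(TN)] + wf[\Gamma(TP)] + (f^*[\mathfrak{M}_y] + \mathfrak{M}_x^{k+1})\,\Gamma(f^*TP)}{\mathfrak{M}_x^{k+1}\,\Gamma(f^*TP)} \qquad (4)$$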




Formula (3) is an easy consequence of the fact that $U_{x,y}$ is the orbit of
$z$ under the action of $K^k_{x,y}$. Then (4) follows from (3) by means of a computation using local coordinates.

Clearly the projection $T(U)_z \to TN_x$ is onto. Hence there is a subspace $W$ of $T(U)_z$ such that the projection $W \to TN_x$ is an isomorphism. Then
$T(J^k)_z = T(J^k_x)_z \oplus W$ so there is a unique linear mapping $\pi_W : T(J^k)_z \to T(J^k_x)_z$
such that $\pi_W|W = 0$ and $\pi_W|T(J^k_x)_z = $ identity. It is easily seen that for any
subspace $V$ of $T(J^k)_z$, $\pi_W[V + T(U)_z]$ is independent of the choice of $W$. The
following formula is easily verified using local coordinates, and formula (4).


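$$\pi_W\bigl[Tj^k(f)[TN_x] + T(U)_z\bigr] = \frac{tf[\Gamma(TN)] + \omega f[\Gamma(TP)] + (f^*[\mathfrak{M}_y] + \mathfrak{M}_x^{k+1})\,\Gamma(f^*TP)}{\mathfrak{M}_x^{k+1}\,\Gamma(f^*TP)} \tag{5}$$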

Since $j^k(f)$ is transversal to $U$ at $x$ if and only if $Tj^k(f)[TN_x] + T(U)_z
= T(J^k)_z$ and since $W \subset T(U)_z$, it follows that $j^k(f)$ is transversal to $U$ at $x$ if
and only if the left-hand side of (5) is $T(J^k_x)_z$. By the identification (1) this is
true if and only if the right-hand side of (5) is $\Gamma(f^*TP)/\mathfrak{M}_x^{k+1}\Gamma(f^*TP)$. Thus
we obtain the following:


Lemma 3. If condition (i-a) of Proposition A holds, then

$$tf[\Gamma(TN)] + \omega f[\Gamma(TP)] + (f^*[\mathfrak{M}_y] + \mathfrak{M}_x^{k+1})\,\Gamma(f^*TP) = \Gamma(f^*TP)$$

for any $x \in N$, where $y = f(x)$.
For any subset $S$ of $N$, let $\mathfrak{M}_S$ denote the ideal in $C^\infty(N)$ consisting of
functions vanishing on $S$. Then Lemma 3 can be generalized as follows:


Lemma 3'. Suppose condition (i) of Proposition A holds. Then for any
finite subset $S = \{x_1, \dots, x_s\}$ of $N$, such that $f(x_1) = \cdots = f(x_s) = y$,

$$tf[\Gamma(TN)] + \omega f[\Gamma(TP)] + (f^*[\mathfrak{M}_y] + \mathfrak{M}_S^{k+1})\,\Gamma(f^*TP) = \Gamma(f^*TP) \tag{6}$$


The proof follows the same lines as the proof of Lemma 3; however, we 
will not give the details. 

Using Lemma 3', Nakayama's lemma, and Malgrange's preparation
theorem, we will now complete the proof that (i) $\Rightarrow$ (ii).

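Let $\mathfrak{I}_S$ denote the ideal of $C^\infty(N)$ consisting of functions vanishing
identically in a neighborhood of $S$. Then (6) (for $k \ge \dim P$) implies

$$tf[\Gamma(TN)] + \omega f[\Gamma(TP)] + (f^*[\mathfrak{M}_y] + \mathfrak{I}_S)\,\Gamma(f^*TP) = \Gamma(f^*TP) \tag{7}$$

(We have replaced $\mathfrak{M}_S^{k+1}$ by the smaller ideal $\mathfrak{I}_S$.) This is shown as follows:
Set

$$E_j = tf[\Gamma(TN)] + (f^*[\mathfrak{M}_y] + \mathfrak{M}_S^{\,j})\,\Gamma(f^*TP)$$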




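Then $E_{k+1}$ contains all of the terms of the left-hand side of (6), except
$\omega f[\Gamma(TP)]$. Also $E_{k+1}$ contains $\omega f(\mathfrak{M}_y\Gamma(TP))$, since this is contained in
$f^*(\mathfrak{M}_y) \cdot \Gamma(f^*TP)$. Thus, it follows from (6) that the $\mathbb{R}$ codimension of $E_{k+1}$
in $\Gamma(f^*TP)$ is at most $p = \dim P$. (By the $\mathbb{R}$ codimension of one $\mathbb{R}$ vector
space $E$ in another $F$, we mean the dimension of $F/E$, considered as an $\mathbb{R}$
vector space.) Since $E_0 \supset E_1 \supset \cdots \supset E_{k+1}$ and $k \ge p$, it follows that
$E_j = E_{j+1}$ for some $j$, $0 \le j \le k$. It follows easily that

$$E_j = D + \mathfrak{M}_S E_j \tag{8}$$

where

$$D = tf[\Gamma(TN)] + f^*[\mathfrak{M}_y] \cdot \Gamma(f^*TP)$$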


Now we apply Nakayama’s lemma in the following form. 


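Nakayama's Lemma. Let $R$ be a ring, $\mathfrak{I}_1$ and $\mathfrak{I}_2$ ideals in $R$, $M$ a finitely
generated module over $R$, and $N$ a submodule of $M$. Suppose every maximal
ideal which contains $\mathfrak{I}_2$ also contains $\mathfrak{I}_1$, and that

$$M = N + \mathfrak{I}_1 M$$

Then

$$M = N + \mathfrak{I}_2 M$$

This is well known in the case $\mathfrak{I}_2 = 0$, and is easily reduced to that case.
Applying Nakayama's lemma with $R = C^\infty(N)$, $\mathfrak{I}_1 = \mathfrak{M}_S$, $\mathfrak{I}_2 = \mathfrak{I}_S$,
$M = E_j$, and $N = D$, we see that (8) implies

$$E_j = D + \mathfrak{I}_S E_j$$

Hence

$$D + \mathfrak{I}_S\,\Gamma(f^*TP) \supset \mathfrak{M}_S^{\,j}\,\Gamma(f^*TP)$$

Since $j \le k$, this formula and (6) imply (7). Thus, we have established (7),
assuming that condition (i) of the theorem holds.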
Next, we use (7) to show

$$tf[\Gamma(TN)] + \omega f[\Gamma(TP)] + \mathfrak{I}_S\,\Gamma(f^*TP) = \Gamma(f^*TP) \tag{9}$$

The proof that (7) implies (9) is based on Malgrange's preparation
theorem and Nakayama's lemma.

As usual, we let $C^\infty(N)_S$ denote the ring of germs at $S$ of $C^\infty$ functions on $N$, and $f^* : C^\infty(P)_y \to C^\infty(N)_S$ the ring homomorphism given by
$f^*(g) = g \circ f$.


Malgrange's Preparation Theorem [1]. If $A$ is a finitely generated $C^\infty(N)_S$
module, and $A/f^*[\mathfrak{M}_y] \cdot A$ is finite dimensional as an $\mathbb{R}$ vector space, then
$A$ is finitely generated as a $C^\infty(P)_y$ module, where the $C^\infty(P)_y$ module
structure on $A$ is induced by $f^*$.

This is a deep result and will not be proved here. Actually Malgrange
stated the result only for the case when $A$ is generated by a single element, and
$S$ is a single point, but the extension to the more general case is very easy.

Let

$$A = \frac{\Gamma(f^*TP)}{tf[\Gamma(TN)] + \mathfrak{I}_S\,\Gamma(f^*TP)}$$

Since $C^\infty(N)_S = C^\infty(N)/\mathfrak{I}_S$, $A$ is a $C^\infty(N)_S$ module. Clearly it is finitely
generated. Formula (7) implies $\bar A = A/f^*[\mathfrak{M}_y] \cdot A$ is finite dimensional as an
$\mathbb{R}$ vector space. In fact, it implies that the canonical images of $\omega f(e_1), \dots,
\omega f(e_q)$ in $\bar A$ span $\bar A$ as an $\mathbb{R}$ vector space for any set $e_1, \dots, e_q$ of $C^\infty(P)$
generators of $\Gamma(TP)$. Thus $A$ is finitely generated as a $C^\infty(P)_y$ module, by
Malgrange's preparation theorem. Now we may apply Nakayama's lemma
with $R = C^\infty(P)_y$, $\mathfrak{I}_1 = \mathfrak{M}_y$, $\mathfrak{I}_2 = 0$, $M = A$, and $N$ the submodule generated
by the canonical images of $\omega f(e_1), \dots, \omega f(e_q)$ ($e_1, \dots, e_q$ as above). The
conclusion $M = N$ means that the submodule of $A$ generated by the canonical
images of $\omega f(e_1), \dots, \omega f(e_q)$ is all of $A$, which implies (9). Thus we have
shown that condition (i) of Proposition A implies (9). (Note that we had to
know $A$ was finitely generated as a $C^\infty(P)_y$ module in order to be able to
apply Nakayama's lemma.)

Now we conclude the proof that (i) implies (ii) by showing that (9) holding for all finite subsets $S$ of $N$ such that $f[S]$ is a single point implies $f$ is
infinitesimally stable.

For any $C^\infty(N)$ module $E$, let supp $E$ denote the set of points $x \in N$ such
that $\mathfrak{I}_x E \ne E$. Suppose $E$ is finitely generated. Then $x \in$ supp $E$ if and only
if $\mathfrak{M}_x E \ne E$ (by Nakayama's lemma) and supp $E$ is closed. If $S$ is a finite set,
then the number of points in supp $E \cap S$ is less than or equal to $\dim_{\mathbb{R}} E/\mathfrak{M}_S E$.

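Suppose $y \in P$ and $S$ is a finite subset of $N$ such that $f[S] = y$. Set

$$E = \frac{\Gamma(f^*TP)}{tf[\Gamma(TN)]} \tag{10}$$

Let $\rho$ be the projection of $\Gamma(f^*TP)$ on $\bar E = E/\mathfrak{M}_S E$. By (9), $\rho \circ \omega f[\Gamma(TP)]
= \bar E$. Clearly $\rho \circ \omega f[\mathfrak{M}_y \cdot \Gamma(TP)] = 0$. Hence,

$$\dim_{\mathbb{R}} \bar E \le \dim_{\mathbb{R}} \frac{\Gamma(TP)}{\mathfrak{M}_y\,\Gamma(TP)} = p$$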


Hence supp $E \cap S$ contains no more than $p$ points. Since we may take $S$ to
be an arbitrary finite subset of $f^{-1}(y)$, it follows that $f^{-1}(y) \cap$ supp $E$ contains not more than $p$ points.

To show $tf[\Gamma(TN)] + \omega f[\Gamma(TP)] = \Gamma(f^*TP)$, we consider an arbitrary
$\zeta \in \Gamma(f^*TP)$. We let $e_1, \dots, e_q$ be a basis for $\Gamma(f^*TP)$ over $C^\infty(N)$. For
each $y \in P$, let $S_y = $ supp $E \cap f^{-1}(y)$, where $E$ is given by (10). By the previous paragraph $S_y$ is finite. Hence (9) holds for $S = S_y$, so there exist
$\xi_y \in \Gamma(TN)$, $\eta_y \in \Gamma(TP)$, $g_{1,y}, \dots, g_{q,y} \in \mathfrak{I}_{S_y}$ such that

$$tf(\xi_y) + \omega f(\eta_y) + \sum_{i=1}^{q} g_{i,y}\,e_i = \zeta \tag{11}$$


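Let $N_y$ be an open neighborhood of $S_y$ in $N$ such that $g_{1,y}|N_y = \cdots = g_{q,y}|N_y = 0$.
A simple point set topological argument, using the assumption that $f$ is
proper, shows that there exists an open neighborhood $P_y$ of $y$ in $P$ such that
$f^{-1}[P_y] \cap$ supp $E \subset N_y$. Let $\{P_\alpha\}_{\alpha \in A}$ be a locally finite cover of $P$ by open
sets such that for each $\alpha \in A$ there exists $y = y(\alpha) \in P$ such that the closure
$\bar P_\alpha$ of $P_\alpha$ is in $P_{y(\alpha)}$. Let $\{\rho_\alpha\}_{\alpha \in A}$ be a $C^\infty$ partition of unity on $P$ such that
supp $\rho_\alpha \subset P_\alpha$ for each $\alpha \in A$.
Set

$$\begin{aligned}
\xi' &= \sum_{\alpha \in A} f^*(\rho_\alpha)\,\xi_{y(\alpha)}, \qquad \eta = \sum_{\alpha \in A} \rho_\alpha\,\eta_{y(\alpha)} \\
\zeta' &= \zeta - tf(\xi') - \omega f(\eta)
\end{aligned} \tag{12}$$

By (11),

$$\zeta' = \sum_{\alpha \in A} \sum_{i=1}^{q} f^*(\rho_\alpha)\,g_{i,y(\alpha)}\,e_i$$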


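We claim $\zeta'$ vanishes identically in a neighborhood of supp $E$. Clearly
$f^*(\rho_\alpha)$ vanishes on $f^{-1}[P - \bar P_\alpha]$ and $g_{i,y(\alpha)}$ vanishes on $N_\alpha = N_{y(\alpha)}$. Hence
$\zeta'$ vanishes on $\bigcap_{\alpha \in A}(N_\alpha \cup f^{-1}[P - \bar P_\alpha])$. This set contains supp $E$ because
$f^{-1}[\bar P_\alpha] \cap$ supp $E \subset f^{-1}[P_{y(\alpha)}] \cap$ supp $E \subset N_\alpha$. It is open because each
$N_\alpha \cup f^{-1}[P - \bar P_\alpha]$ is open and every point in $N$ has a neighborhood contained
in all but a finite number of the sets $f^{-1}[P - \bar P_\alpha]$. Thus $\zeta'$ vanishes on a
neighborhood of supp $E$.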

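It follows that for each $x \in N$, there exist $\xi''_x \in \Gamma(TN)$ and $g''_{x,1}, \dots,
g''_{x,q} \in \mathfrak{I}_x$ such that

$$\zeta' = tf(\xi''_x) + \sum_{i=1}^{q} g''_{x,i}\,e_i \tag{13}$$

For each $x \in N$, let $W_x$ be an open neighborhood of $x$ in $N$ such that
$g''_{x,1}, \dots, g''_{x,q}$ vanish on $W_x$. Let $\{W_\alpha\}_{\alpha \in A}$ be a locally finite cover of $N$
such that for each $\alpha \in A$ there exists $x = x(\alpha) \in N$ such that $W_\alpha \subset W_{x(\alpha)}$.
Let $\{\sigma_\alpha\}$ be a $C^\infty$ partition of unity on $N$ such that supp $\sigma_\alpha \subset W_\alpha$. Set
$\xi'' = \sum_\alpha \sigma_\alpha\,\xi''_{x(\alpha)}$. Then $\sigma_\alpha\,g''_{x(\alpha),i} = 0$, $1 \le i \le q$, so (13) yields

$$\zeta' = tf(\xi'')$$

Combining this with (12), we get

$$\zeta = tf(\xi) + \omega f(\eta)$$

where $\xi = \xi' + \xi''$.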




This completes the outline of the proof that (i) implies (ii). 

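Now we will sketch the proof that (iii) implies (i). Let $N^{(q)}$ denote
the set of $q$-tuples $\langle x_1, \dots, x_q \rangle \in N^q$ such that $x_i \ne x_j$ if $i \ne j$. Let ${}_qJ^k(N, P)$
denote the set of $q$-tuples $\langle z_1, \dots, z_q \rangle$ of points in the $q$-fold Cartesian product $(J^k(N, P))^q$ such that $\langle x_1, \dots, x_q \rangle \in N^{(q)}$, where $x_i$ is the source of $z_i$.
If $g : N \to P$ is any $C^\infty$ mapping, we let ${}_qj^k(g) : N^{(q)} \to {}_qJ^k(N, P)$ be defined by
${}_qj^k(g)(x_1, \dots, x_q) = \langle j^k(g)(x_1), \dots, j^k(g)(x_q) \rangle$.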

We next state two results, which together form a complement to Thom’s 
transversality theorem. We say a subset of a topological space is residual if it 
contains a countable intersection of dense open subsets. We say a space is 
Baire if every residual set is dense. 


Proposition 3. $C^\infty(N, P)$ is a Baire space with respect to the topology W.
The proof is not very difficult, but it is rather long, so we will omit it. 


Proposition 4. If $U$ is a submanifold of ${}_qJ^k(N, P)$ then $\{g : {}_qj^k(g)$ meets $U$ only
transversally$\}$ is a residual subset of $C^\infty(N, P)$.

The proof of this is similar to the proof of Thom’s transversality 
theorem. Like the proof of Thom’s transversality theorem, it relies in an 
essential way on Sard’s theorem. We omit it. 

Assuming Propositions 3 and 4, the proof of (iii) $\Rightarrow$ (i) is easy. First
we show (i-a). Let $U$ be a contact class in $J^k(N, P)$. By Propositions 3 and
4, there exists a $C^\infty$ mapping $g : N \to P$ arbitrarily close to $f$ with respect to
the topology W and such that $j^k(g)$ meets $U$ only transversally. The assumption that $f$ is stable implies that we may choose $g$ such that $f = h' \circ g \circ h$,
where $h$ (resp. $h'$) is a $C^\infty$ diffeomorphism of $N$ (resp. $P$) onto itself. There
exists a unique diffeomorphism $h'_*h^*$ of $J^k(N, P)$ onto itself such that for any
$C^\infty$ mapping $u : N \to P$ and any $x \in N$, $h'_*h^*(j^k(u)(x)) = j^k(h' \circ u \circ h)(h^{-1}(x))$.
Clearly $h'_*h^*$ maps $U$ into itself. Thus, the fact that $j^k(g)$ meets $U$ only
transversally implies that $h'_*h^* \circ j^k(g) \circ h$ meets $U$ only transversally. By
definition of $h'_*h^*$,

$$h'_*h^* \circ j^k(g) \circ h = j^k(h' \circ g \circ h) = j^k(f)$$

Thus, $j^k(f)$ meets $U$ only transversally. This shows (i-a).
Now consider nonsubmersive contact classes $U_1, \dots, U_a$ in $J^k(N, P)$.
Let $x = \langle x_1, \dots, x_q \rangle$ be a crossing of the family of immersions

$$f \mid j^k(f)^{-1}U_i : j^k(f)^{-1}U_i \to P, \qquad i = 1, \dots, a \tag{14}$$

that is, suppose that each $x_j$ is in $j^k(f)^{-1}U_{i(j)}$ for suitable $i(j)$ and that
$f(x_1) = \cdots = f(x_q)$. To prove (i-b) it is enough to show that $x$ is a normal
crossing.
Let $\Delta$ denote the diagonal in $P^q$, that is, $\{\langle y, \dots, y \rangle : y \in P\}$. Let
$\pi : {}_qJ^k(N, P) \to P^q$ denote the canonical projection (which associates with
$z = \langle z_1, \dots, z_q \rangle \in {}_qJ^k(N, P)$ the point $y = \langle y_1, \dots, y_q \rangle \in P^q$, where $y_i$ denotes
the target of $z_i$). Let $\tilde\Delta = \pi^{-1}\Delta$. Let $U = (U_{i(1)} \times \cdots \times U_{i(q)}) \cap \tilde\Delta$. The
reasoning that was used to show that $f$ being stable implies (i-a) also shows
that $f$ being stable implies that ${}_qj^k(f)$ meets $U$ only transversally. On the
other hand, the statement that $x$ is a crossing of the family of immersions (14)
is equivalent to ${}_qj^k(f)(x) \in U$, and the statement that $x$ is a normal crossing of
the family of immersions (14) is equivalent to the statement that ${}_qj^k(f)$ meets
$U$ transversally at $x$. Thus the fact that ${}_qj^k(f)$ meets $U$ only transversally
implies that $x$ is a normal crossing. This completes the proof that (iii)
implies (i).

We have outlined the proofs of (i) $\Rightarrow$ (ii) and (iii) $\Rightarrow$ (i). We have
omitted outlining the proof of (ii) $\Rightarrow$ (iii), because it is very technical. The
proof of (ii) $\Rightarrow$ (iii) will appear shortly [2].


REFERENCES 


1. B. Malgrange, Ideals of Differentiable Functions, Oxford University Press, 
London, 1966. 


2. J. Mather, Stability of $C^\infty$ Mappings, I, II, to appear in the Annals of Math.


XXII 


One-Parameter Subgroups Do Not 
Fill a Neighborhood of the Identity 
in an Infinite-Dimensional Lie 


(Pseudo-) Group 


CHARLES FREIFELD 


INTRODUCTION 


In [1], S. Lie considered the question of whether a transformation 
sufficiently close to the identity in a Lie (pseudo-) group can be joined to the 
identity transformation by a one-parameter family of transformations. In 
this paper, we show that this cannot be done in the infinite case. By a Lie 
(pseudo-) group Lie meant a set of local, invertible, analytic transformations 
on some manifold which are the solutions of a set of partial differential 
equations. Lie assumed further that this set always contained the identity 
transformation, that it was closed under composition when composition was 
defined, and that the inverse of every transformation in it was still in it. For
example, the set of local, complex-analytic, invertible transformations on $\mathbb{C}^n$
(the equations being the Cauchy-Riemann equations), the set of all invertible,
local, analytic transformations of $\mathbb{R}^n$, $S^n$ ($n$-sphere) or $\mathbb{C}^n$ (the equations being
the empty set of equations), are Lie (pseudo-) groups. For more examples, 
and a more precise treatment of the concept of Lie (pseudo-) group in general, 
see [6]. 

Throughout this note, we will use the term Lie (pseudo-) group for 
these objects, although the terms "infinite Lie group" or "infinite group of
Lie and Cartan" are used in [2] and [6]. Also, Lie considered only analytic
transformations, while we will consider both the $C^\infty$ and analytic cases.
Finally, it should be remarked that Lie proved that a neighborhood of the 




identity is covered by one-parameter subgroups when the Lie group is finite- 
dimensional. 

Note that since we always work formally, the result also holds when the
locally defined $T$ (below) is replaced by its germ at the origin.


PART I 


In $\mathbb{C}$, let $U$ and $V$ be small open neighborhoods of the origin and let
$T : U \to V$ be defined by

$$T(x) = \exp\left(\frac{2\pi i}{n}\right)x + \alpha x^{n+1}$$

where $x$ is a complex number, $\alpha$ is a positive real number, and $n$ a positive
integer. Then we will prove the following.


Theorem. There exists an $\alpha$ (small) such that, for all $n$, $T$ does not lie on a
one-parameter subgroup of local, $C^\infty$ transformations.

In [2], Sternberg showed that $T$ does not lie on a one-parameter group
of local transformations of $\mathbb{C}$ into itself, provided it is assumed that the
transformations are complex-analytic and leave the origin fixed. (On page
454 of [2], the differential equation should read

$$a_p'(t) = \beta a_p(t) + a_p'(0) \exp(p\beta t)$$

and the solution should read

$$a_p(t) = \frac{a_p'(0)}{(p - 1)\beta}\,[\exp(p\beta t) - \exp(\beta t)].)$$


But the set of transformations leaving the origin fixed is not a Lie (pseudo-) 
group because it is not the set of solutions of some system of partial differen- 
tial equations, so that this is not, at least immediately, a proper counter- 
example to Lie's conjecture. The theorem above shows that it is. In our
situation $\alpha$ will be chosen small and $n$ large, so that "close to the identity"
will mean that the local transformation and its derivatives are uniformly close
to the identity in a small neighborhood of the origin.


CLAIM. A transformation S, leaving the origin fixed, that can be joined 
to the identity by a one-parameter group, must have an eigenvalue equal to 
one. 


PROOF. In fact, if the one-parameter group does not leave the origin fixed, it 
defines a closed curve passing through the origin. This curve is clearly 




pointwise fixed by S and the tangent to this curve is the desired eigenvector 
of eigenvalue one. 

Thus, in the analytic case, where we know $T$ does not lie on a one-parameter group of transformations fixing the origin, we are done. We now
consider the $C^\infty$ case. Given a transformation, we can consider its formal
Taylor series expansion at the origin. If $T$ lies on a one-parameter group $X_t$,
its formal Taylor series expansion must lie on the one-parameter group of
formal Taylor series expansions of the $X_t$. We will show that this cannot
happen by working formally.

Using $z, \bar z$ as $C^\infty$ coordinates,

$$T = X_1(z, \bar z) = \exp\left(\frac{2\pi i}{n}\right)z + \alpha z^{n+1}$$

Let

$$X_t(z, \bar z) = a_1(t)z + a_2(t)\bar z + a_{1,1}(t)z^2 + a_{1,2}(t)z\bar z + a_{2,2}(t)\bar z^2 + \cdots$$


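be the Taylor expansion of $X_t$. (By the preceding paragraph, we are allowed
to assume that all $X_t$ leave the origin fixed.) Assume $n$ is some large integer.
First, we determine $a_1(t)$:

$$X_{1/2}(z, \bar z) \circ X_{1/2}(z, \bar z) = a_1(\tfrac12)\bigl(a_1(\tfrac12)z + a_2(\tfrac12)\bar z + \cdots\bigr) + a_2(\tfrac12)\bigl(\bar a_1(\tfrac12)\bar z + \bar a_2(\tfrac12)z + \cdots\bigr) + \cdots$$

where the product is computed formally, of course. Equating terms, we have

$$\exp\frac{2\pi i}{n} = a_1(\tfrac12)\,a_1(\tfrac12) + a_2(\tfrac12)\,\bar a_2(\tfrac12)$$

$$0 = a_1(\tfrac12)\,a_2(\tfrac12) + a_2(\tfrac12)\,\bar a_1(\tfrac12)$$

Now $a_2(\tfrac12) \ne 0$ implies $a_1(\tfrac12) = -\bar a_1(\tfrac12)$, which implies that $a_1(\tfrac12)$ is pure
imaginary. Therefore, $a_1(\tfrac12)^2$ is a real number. But $a_2(\tfrac12)\bar a_2(\tfrac12)$ is real also.
This contradicts the first equation. Thus,

$$a_1(\tfrac12) = \exp\tfrac12\left(\frac{2\pi i}{n} + 2k\pi i\right)$$

and $a_2(\tfrac12) = 0$. Repeating this reasoning for $X_{1/4} \circ X_{1/4}$, etc., all the equations remain in the above form because $a_2(\tfrac12) = 0$. We have, finally,

$$a_1(t) = \exp(t\beta), \qquad \beta = 2\pi i\left(\frac{1}{n} + k\right)$$

and $a_2(t) = 0$, for all $t$, by continuity. Now

$$X_{t+s}(z, \bar z) = X_s(X_t(z, \bar z))$$

so

$$X_t'(z, \bar z) = X_0'(X_t(z, \bar z))$$

by differentiating with respect to $s$ at $s = 0$. Therefore,

$$\begin{aligned}
&a_1'(0)\bigl(a_1(t)z + a_2(t)\bar z + a_{1,1}(t)z^2 + a_{1,2}(t)z\bar z + a_{2,2}(t)\bar z^2 + \cdots\bigr) \\
&\quad + a_2'(0)\bigl(\bar a_1(t)\bar z + \bar a_2(t)z + \cdots\bigr) + a_{1,1}'(0)\bigl(a_1(t)z + \cdots\bigr)^2 \\
&\quad + a_{1,2}'(0)\bigl(a_1(t)z + a_2(t)\bar z + \cdots\bigr)\bigl(\bar a_1(t)\bar z + \bar a_2(t)z + \cdots\bigr) \\
&\quad + a_{2,2}'(0)\bigl(\bar a_1(t)\bar z + \cdots\bigr)^2 + \cdots \\
&= a_1'(t)z + a_{1,1}'(t)z^2 + a_{1,2}'(t)z\bar z + a_{2,2}'(t)\bar z^2 + \cdots
\end{aligned}$$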
Thus,

$$a_{1,1}'(t) = \beta a_{1,1}(t) + a_{1,1}'(0) \exp(2\beta t)$$

which, by solving an ordinary differential equation, implies

$$\beta a_{1,1}(t) = K_{1,1}[\exp(2\beta t) - \exp(\beta t)]$$

which in turn implies $a_{1,1}(t) \equiv 0$, since $a_{1,1}(0) = 0$ and $a_{1,1}(1) = 0$ by assumption. Thus, $K_{1,1} = 0$. Now

$$a_{1,2}'(t) = \beta a_{1,2}(t) + a_{1,2}'(0)$$

implies

$$\beta a_{1,2}(t) = K_{1,2}[\exp(\beta t) - 1]$$

So $a_{1,2}(t) \equiv 0$ [$a_{1,2}(0) = a_{1,2}(1) = 0$].
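
The pattern in these equations can be checked numerically. The sketch below is an illustration only: the function names, the sample values n = 7 and k = 0, and the constant c (standing in for the unknown derivative at 0) are assumptions chosen for the check. It integrates a'(t) = βa(t) + c·exp(pβt) with a(0) = 0 by a classical Runge-Kutta scheme and compares the result with the closed form c[exp(pβt) − exp(βt)]/((p − 1)β).

```python
import cmath

def closed_form(beta, p, c, t):
    """Solution of a' = beta*a + c*exp(p*beta*t) with a(0) = 0, for p != 1."""
    return c * (cmath.exp(p * beta * t) - cmath.exp(beta * t)) / ((p - 1) * beta)

def rk4(beta, p, c, t_end=1.0, steps=2000):
    """Integrate the same equation numerically from t = 0 to t_end."""
    def rhs(t, a):
        return beta * a + c * cmath.exp(p * beta * t)
    h = t_end / steps
    a, t = 0j, 0.0
    for _ in range(steps):
        k1 = rhs(t, a)
        k2 = rhs(t + h / 2, a + h / 2 * k1)
        k3 = rhs(t + h / 2, a + h / 2 * k2)
        k4 = rhs(t + h, a + h * k3)
        a += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return a

n, k = 7, 0                           # sample values, assumed for illustration
beta = 2j * cmath.pi * (1 / n + k)    # beta = 2*pi*i*(1/n + k)
c = 1.0                               # stands in for the unknown derivative at 0

for p in (2, -2):
    exact = closed_form(beta, p, c, 1.0)
    approx = rk4(beta, p, c)
    # columns: p, integration error, |a(1)| for c = 1
    print(p, abs(exact - approx), abs(exact))
```

For these sample values |a(1)| is nonzero whenever c is nonzero, so the requirement a(1) = 0 forces c = 0; this is the step by which each low-order coefficient is shown to vanish identically.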


$$a_{2,2}'(t) = \beta a_{2,2}(t) + a_{2,2}'(0) \exp(-2\beta t)$$

implies

$$3\beta a_{2,2}(t) = K_{2,2}(\exp(\beta t) - \exp(-2\beta t))$$

so $a_{2,2}(t) \equiv 0$. Thus, we can continue the procedure and obtain

$$a_{1,1,1}'(t) = \beta a_{1,1,1}(t) + a_{1,1,1}'(0) \exp(3\beta t)$$

The solution of this differential equation shows that $a_{1,1,1}(t) = 0$ for all $t$, in
exactly the same way as before. Finally, we obtain, for our chosen large $n$,
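
A hedged sketch of the final step, not taken from the source: assume the pattern above persists, so that every coefficient of order $2$ through $n$ vanishes identically, and write $a_{(n+1)}(t)$ (a label introduced only for this sketch) for the coefficient of $z^{n+1}$ in $X_t$. The same computation would then give:

```latex
% Sketch under the stated assumptions; a_{(n+1)} is a label introduced here.
\[
  a_{(n+1)}'(t) = \beta\, a_{(n+1)}(t) + a_{(n+1)}'(0)\, \exp\bigl((n+1)\beta t\bigr),
  \qquad a_{(n+1)}(0) = 0,
\]
\[
  n\beta\, a_{(n+1)}(t) = a_{(n+1)}'(0)\,\bigl[\exp\bigl((n+1)\beta t\bigr) - \exp(\beta t)\bigr].
\]
% Since n*beta = 2*pi*i*(1 + kn), exp(n*beta) = 1, hence
\[
  n\beta\, a_{(n+1)}(1) = a_{(n+1)}'(0)\, \exp(\beta)\,\bigl[\exp(n\beta) - 1\bigr] = 0 .
\]
```

Under these assumptions $a_{(n+1)}(1) = 0$, whereas $T = X_1$ requires $a_{(n+1)}(1) = \alpha \ne 0$, which would give the contradiction that the theorem asserts.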


PART II 


Since our considerations have been local, we may, in a suitable neighborhood of the origin, smooth the transformation $T$ in a $C^\infty$ way so that
outside the neighborhood, $T$ is just the linear map

$$T(x) = \exp\left(\frac{2\pi i}{n}\right)x$$




We would like to be able to make $T$ close to the identity at all points, so we
must be able to make the derivatives of $\alpha(x)x^n$ stay close to zero. To accomplish this, one takes a smoothing function which is constantly equal to $\alpha$
around the origin and which is constantly equal to zero outside a small
interval around the origin. Such a function, for example, can be found in
Steenrod [5]. It involves adding a constant to an exponential of the form


$$\alpha(x) = \alpha \exp\left(\frac{1}{(x - c)(x - d)}\right)$$

where $(c, d)$ is a small interval. By taking $\alpha$ small (depending on $n$) and $(c, d)$
small, the exponentials in the derivatives will control the derivatives of $x^n$, so
that we can make

$$T(x) = \exp\left(\frac{2\pi i}{n}\right)x + \alpha(x)x^n$$

as close to the identity as we wish. Thus, for the group of global $C^\infty$
diffeomorphisms of the complex line, or, since $T$ can be extended to $S^2$
(it becomes a rotation at infinity), for the group of diffeomorphisms of
the two-sphere $S^2$, no neighborhood of the identity can be covered by
one-parameter groups. Hence, Lie's conjecture fails for the group of diffeomorphisms of a compact manifold.


REMARK 


The Lie (pseudo-) groups we have been considering are not Banach 
(local) Lie Groups in the sense of Maissen [3]. In that paper, it is shown that 
for a Banach (local) Lie group with a certain condition of uniformity on its 
multiplication, a sufficiently small neighborhood of the identity is covered by
one-parameter groups. Our Lie (pseudo-) groups are not modeled on Banach
spaces and the extra condition of uniformity, needed to obtain the existence of 
a local solution to an ordinary differential equation in a Banach space, is not 
satisfied because our topology depends on all the derivatives of the functions. 
For references on these infinite-dimensional groups, see Eells [4]. 


REFERENCES 


1. Sophus Lie, Abhandlungen der Sächs. Gesellschaft der Wissenschaften,
Math.-Phys. Klasse 21 (1895), pp. 43-46. Or, Gesammelte Abhandlungen, Sechster
Band, pp. 395-399.

2. Shlomo Sternberg, Infinite Lie Groups and the Formal Aspects of Dynamical Systems, J. Math. and Mech. 10 (1961), pp. 451-474.

3. Bernhard Maissen, Lie-Gruppen mit Banachräumen als Parameterräume,
Acta Math. 108 (1962), pp. 229-269.

4. James Eells, Jr., A Setting for Global Analysis, Bull. Am. Math. Soc. 72
(1966), pp. 751-807.

5. Norman Steenrod, Topology of Fibre Bundles, Princeton Univ. Press, 1951,
pp. 25-26.

6. I. Singer and S. Sternberg, The Infinite Groups of Lie and Cartan,
Jour. d'Analyse Math. XV (1965), pp. 1-114.


XXIII 


A Dynamical Theory for 
Morphogenesis: Elementary
Catastrophies on $\mathbb{R}^4$


RENE THOM 


(Explanation of the local differential model that should allow the
interpretation of the morphological changes observed in
many natural phenomena.) This theory will be presented
in full in a work entitled "Stabilité structurelle et
Morphogenèse," to be published by W. A. Benjamin, Inc., New York.




XXIV 


How to Turn a Sphere Inside Out 


STEPHEN SMALE 


"A Classification of Immersions of the Two-Sphere," Transactions of the American Mathematical Society 90, 281-290 (1959).




XXV 


Eversion of the 2-Sphere 


BRYCE S. DEWITT 


The purpose of this talk is to give a nonrigorous "popular" demonstration of Smale's theorem that the 2-sphere can be turned inside out via a differentiable homotopy of immersions in Euclidean 3-space.¹ The key to this
demonstration is the sequence of diagrams shown in Fig. 1. If you can
remember this sequence you can always reconstruct the demonstration.

The diagrams in the figure represent a sequence of parallel plane sections 
through a certain compact boundaryless 2-manifold which has been immersed 
in Euclidean 3-space. Part of my demonstration will consist of trying to 
convince you that this manifold is the 2-sphere. 

The use of plane sections permits one to visualize immersions of 2-manifolds without resorting to perspective drawings or constructing actual
3-dimensional models.² It is true, of course, that a change in the orientation
of the parallel planes may drastically alter the appearance of the sequence
of sections of a given immersion, just as a real object may look quite different
when viewed from different directions. In particular, the order in which
critical points appear is subject to wide variation. The critical points in
Fig. 1 are the dots in diagrams A, B, F, and H. At these points the intersecting plane is tangent to the immersion. The critical points in diagrams F
and H are saddle points, while those in diagrams A and B are maxima (or


¹ An immersion may be loosely defined as a differentiable map from an $n$-manifold
into $\mathbb{R}^{n+1}$ such that the derived tangent map from the $n$-manifold into the space of $n$-planes
in $\mathbb{R}^{n+1}$ is everywhere defined and continuous.

² This is not meant to imply that drawings and models, when one has the time to
make them, are not extremely helpful in yielding an overall visualization of immersions. A 
beautiful set of shaded drawings will be found in an article by A. Phillips in the Scientific 
American 214, No. 5, 112 (1966). These drawings, which were studied in obtaining Fig. 1, 
are based on a sequence of deformations of the 2-sphere devised by A. Shapiro. 
Shapiro’s sequence has suggested to Marcel Froissart and Bernard Morin another sequence, 
less complicated for purposes of 3-dimensional visualization, which they demonstrated 
during the Rencontres by constructing 3-dimensional models. 




FIGURE 1. Key diagrams. 




minima). It is important to take note of all the critical points which occur in
a given sequence of sections. I shall always assume that the manifold has
been rotated into a "general position" such that the critical points occur
singly (that is, one by one, with no degeneracies) in the sequence.

Between any two successive occurrences of critical points the members
of a given sequence of sections represent a differentiable homotopy of immersions of some compact boundaryless 1-manifold in the plane. A compact
boundaryless 1-manifold is topologically the sum of a finite number of $S^1$'s.
It is therefore of preliminary interest to discuss the classification of immersions
of $S^1$. I quote, without proof, the relevant theorem, due to Whitney:


THEOREM. Two immersions of $S^1$ in the plane can be connected by a differentiable homotopy of immersions if and only if they have the same winding
number.


The "winding number" is defined as follows. Each immersion has
an orientation determined by the parametrization $p(t) = (x(t), y(t))$ which
defines it. Let $T(t)$ be the unit tangent vector at $p(t)$ in the direction of
increasing $t$. Without loss of generality $p(t)$ and $T(t)$ may be taken to be
periodic functions of period 1. Let $\Theta$ be the angle through which $T(t)$ turns
as $t$ runs from 0 to 1. Then $\Theta/2\pi$ is the winding number. The set of all
possible winding numbers is the set of all positive and negative integers
(including zero). You may easily convince yourselves of the plausibility of
Whitney's theorem by drawing diagrams such as those of Fig. 2, in which
various immersions have been grouped according to their winding numbers.
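
The definition above can also be tried out numerically. The following sketch is an illustration only: the function name `winding_number`, the polygonal approximation, and the four sample curves are assumptions, not part of the talk. It approximates a closed immersed curve by a closed polygon and sums the signed turning angles between successive edges, which gives the total turning divided by 2π for the polygon.

```python
import math

def winding_number(points):
    """Total turning of the tangent of a closed polygon, divided by 2*pi.

    `points` lists the vertices in order of increasing parameter; the last
    vertex is joined back to the first.  Each turning angle is taken in
    (-pi, pi], which is adequate when the sampling is fine.
    """
    n = len(points)
    total = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        x2, y2 = points[(i + 2) % n]
        ux, uy = x1 - x0, y1 - y0          # edge arriving at vertex i+1
        vx, vy = x2 - x1, y2 - y1          # edge leaving vertex i+1
        # signed angle from the arriving edge to the leaving edge
        total += math.atan2(ux * vy - uy * vx, ux * vx + uy * vy)
    return round(total / (2 * math.pi))

def sample(curve, m=400):
    """Sample a curve p(t) of period 1 at m equally spaced parameter values."""
    return [curve(j / m) for j in range(m)]

def circle(t):
    a = 2 * math.pi * t
    return (math.cos(a), math.sin(a))

def reversed_circle(t):
    return circle(-t)

def figure_eight(t):
    a = 2 * math.pi * t
    return (math.sin(a), math.sin(2 * a))

def double_loop(t):                        # limacon r = 1 + 2 cos(a)
    a = 2 * math.pi * t
    r = 1 + 2 * math.cos(a)
    return (r * math.cos(a), r * math.sin(a))

for name, c in [("circle", circle), ("reversed circle", reversed_circle),
                ("figure eight", figure_eight), ("double loop", double_loop)]:
    print(name, winding_number(sample(c)))
```

The four sample curves have winding numbers 1, −1, 0 and 2, so by Whitney's theorem no two of them can be connected by a differentiable homotopy of immersions.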

Each immersion of $S^1$ in the plane is two sided; that is, neighboring
strips on either side of the curve which the point $p(t)$ traces out can be everywhere consistently distinguished.³ One way of making the distinction
graphically is to paint the strips in different colors. An easier way is to put
"spikes" on one side of the curve thus: [inline sketch: a curve with a spike on one side]. I shall adopt the
convention that the spikes are to be placed on the side which lies to the right
when one moves along the curve in the direction of increasing $t$ (as shown by
the arrows) thus: [inline sketch: an arrowed curve with a spike on its right]. In the case of the circle, one of the
sides may legitimately be called the "inside" and the other the "outside." It
is a corollary of Whitney's theorem that the circle cannot be turned inside out
by a differentiable homotopy of immersions.⁴

The concept of "sidedness" can be generalized to immersions of
$n$-manifolds. An immersion is always two sided when the corresponding
manifold is orientable and one sided when it is not. The two sidedness of


³ The problem of overlap at intersections may be handled by ascribing a 4-sided
character to the intersection points.

⁴ Note the distinction which is made between the circle as an immersion and $S^1$ as a
manifold.




FIGURE 2. Winding numbers. 


immersions of orientable 2-manifolds can be represented graphically by means
of the sequences of plane sections already introduced. One may place spikes
on the sectional curves, as in Fig. 1, or arrows, as in Fig. 3 (which depicts two
distinct slicings of the torus). In each case the emplacement of the spikes or
the arrows may be carried out in a "consistent" manner over the entire
sequence. In the case of a nonorientable 2-manifold, such as the Klein bottle
depicted in Fig. 4, no consistent emplacement of spikes or arrows can be found.

Each plane section of an immersion of an orientable 2-manifold can be
assigned a unique set of winding numbers. If the section contains no critical
point these numbers are integers, one for each immersion of $S^1$ which occurs
in the section. If the section contains a critical point, fractional winding
numbers are employed. For example, a critical point which corresponds to a




FIGURE 4. The Klein bottle. 


maximum or a minimum will be assigned the number $+\tfrac12$ or the number $-\tfrac12$
according as the circle whose imminent appearance or disappearance it signals
has the winding number $1$ or $-1$. In the case of saddle points the assignment
procedure is more complicated. The immersion is first differentiably distorted so that the sectional curves running through the saddle point intersect



at right angles. If arrows have been consistently affixed to these curves they 
will form the following pattern (modulo a rotation) near the saddle point: 


Starting out from the saddle point in either of the two emergent directions one follows the corresponding curve until it returns along one of the
two re-entrant lines, and one divides the total angle through which the tangent
vector has turned by $2\pi$. The same recipe is then applied with the opposite
starting direction. The result is a pair of fractions having either the form
$\{m + \tfrac14, n + \tfrac14\}$ or the form $\{m - \tfrac14, n - \tfrac14\}$, where $m$ and $n$ are integers
(positive, negative, or zero).
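
A numerical illustration of these fractions, offered as a sketch only: the lemniscate of Bernoulli is used here as a stand-in for a sectional curve through a saddle point (it crosses itself at right angles at the origin), and the function names are assumptions. The code follows one lobe from the crossing point back to the crossing point and divides the accumulated turning of the tangent by 2π.

```python
import math

def lemniscate(t):
    """Lemniscate of Bernoulli; the origin is reached at t = -pi/2 and t = pi/2."""
    s, c = math.sin(t), math.cos(t)
    d = 1 + s * s
    return (c / d, s * c / d)

def turning_fraction(curve, t0, t1, m=20000):
    """Turning of the chord direction along curve(t), t0 <= t <= t1, over 2*pi."""
    pts = [curve(t0 + (t1 - t0) * j / m) for j in range(m + 1)]
    total = 0.0
    for j in range(1, m):
        ux, uy = pts[j][0] - pts[j - 1][0], pts[j][1] - pts[j - 1][1]
        vx, vy = pts[j + 1][0] - pts[j][0], pts[j + 1][1] - pts[j][1]
        # signed angle between successive chords
        total += math.atan2(ux * vy - uy * vx, ux * vx + uy * vy)
    return total / (2 * math.pi)

# Right-hand lobe, traversed counterclockwise from the crossing back to it.
right = turning_fraction(lemniscate, -math.pi / 2, math.pi / 2)
# The same lobe traversed in the opposite sense.
right_reversed = turning_fraction(lambda t: lemniscate(-t),
                                  -math.pi / 2, math.pi / 2)
print(round(right, 3), round(right_reversed, 3))
```

The two values are close to 3/4 and −3/4, that is, of the form m − 1/4 with m = 1 and n + 1/4 with n = −1: a loop that leaves and re-enters a right-angle crossing always misses a whole number of turns by a quarter turn.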

Applying these rules to Fig. 1, we find that diagram A is characterized
by the winding-number set $\{-\tfrac12\}$, diagram B by the set $\{-1, \tfrac12\}$, diagrams C,
D, and E by the set $\{-1, 1\}$, diagrams F and H by the set $\{-\tfrac34, \tfrac54\}$, and diagram G by the set $\{1\}$. By a straightforward extension of Whitney's theorem
it can be proved that these sets completely characterize the corresponding
plane immersions up to differentiable homotopy. Therefore there exists a
sequence of differentiable homotopies which, section by section, transforms
Fig. 1 into Fig. 5, the diagrams of which have winding-number sets identical
with those of Fig. 1. A graphical portrayal of the distortions involved in the
case of diagrams F and H is given in Fig. 6.

The diagrams of Fig. 5 are recognizable as plane sections of a simple
distortion of the 2-sphere. This suggests that the immersion in 3-space which
Fig. 1 represents is obtainable from the 2-sphere by a differentiable homotopy
of immersions. This is not yet obvious, however, for I have thus far only
indicated that corresponding diagrams of Figs. 1 and 5 are connectable by
differentiable homotopies of immersions in the plane. I have not extended
this connectability to an overall connectability via immersions in 3-space.
That is, I have established a "lateral" connection between Figs. 1 and 5, but I
have yet to show that a "vertical" connection between the top and bottom of
each figure (that is, canonical behavior in the neighborhood of each critical
point and differentiable homotopies between successive occurrences of them)
can be consistently maintained at every intermediate stage of the lateral
connection.

That such a consistent overall connection can, in fact, be established is 
made intuitively clear by the following two-dimensional array of symbols: 




A A A A A A GG GGGG 
B B B B BB GG G G&G H,t 
CC C C €C C€ GG G G H,* H,* 
DD D D Cc OD G G G_H,* H,* H;* 
EE E E D C G G_ H,* H,* H;* H,+ 
EE E E £E D G H,* H,* H;+ H,* G 
EE E E E £E H H, H, H; H, F 
EE E €-E €E F,t I H, H, H;~ H, E 
E E E EE F,* F,* I I 4H, H,~ H3;7> H,7 
E E E  F,* F,* F;t I IY I 4H,” H,7 H;7 
E E F,* F,t F,;* F,* I - 1 I 4H, H,7 
E F,*t F,* F;* Fy* E IF | IF J H,- 
F F, F, F; Fy F I rt F I. Tf. J 
G F, F, F; F,~ G I I F§ FY |. € 
G G F, F, F;” F,7 I rt -. TF € D 
GG G F, F, F;7 Ce € CC x 
GG G G&G F, F,7 B B B B BB 
GG GG 6G F,- A A A A A A 


With the exception of those symbols bearing a + or − sign each symbol
stands for one of the diagrams pictured in Figs. 1, 5, and 6. The + and −
signs are attached only to the symbols representing the intermediate diagrams
of Fig. 6. Each of these diagrams depicts a plane section containing a saddle
point. The attachment of a sign to its corresponding symbol indicates that
the saddle is to be split, by moving the plane either upward (+) or downward
(−).

The above array has the following properties: Its first column is a
stretched-out version of Fig. 1, and its last column represents a differentiable
distortion of the immersion depicted in Fig. 5.⁵ Pick any diagram in the
interior of the array. The diagrams immediately above and below it, and to
either side of it, differ from it only by modest distortions (if at all) and are
related to it in obvious differentiable ways. It becomes clear therefore that
each column of the array represents an immersion of $S^2$, and that the immersion represented by Fig. 1 is indeed obtainable from the 2-sphere by a differentiable homotopy of immersions.

We have still not turned the sphere inside out. How do we accomplish 
this? The answer is to be found in a very special property possessed by the 
immersion depicted in Fig. 1. In this immersion the 2-sphere has been 
folded back on itself, through a maze of self-intersections, in such a way that 


5 One can easily trace the steps of the latter distortion by drawing further arrays of 
diagrams. It is difficult, however, to acquire a three-dimensional visualization of these 
steps, and this constitutes a pedagogical weakness of the present approach. 




FIGURE 5. A distorted 2-sphere. 




FIGURE 6. Distortions of diagrams F and H. 


it appears to consist of a pair of self-intersecting surfaces which are separated 
everywhere by a more or less constant distance. It is a simple matter to cause 
the “two” surfaces to exchange places by a differentiable homotopy of 
immersions. The result of the interchange is then unfolded until it regains 
the form of a 2-sphere. The spikes which were originally on the outside of 
the 2-sphere (Fig. 5) are now to be found on the inside.⁶ 

It is evidently possible to cause the “two” surfaces of Fig. 1 to exchange 
places in such a way that they exactly coincide at some intermediate stage of 
the homotopy. At this moment of “crossover” they assume the configuration 
depicted in Fig. 7. Diagram A has become diagram U, diagrams B and 


6 It is amusing to note that although the spikes are initially on the outside, they are 
all hidden on the “inside” by the time the stage depicted in Fig. 1 is reached. 




FIGURE 7. Immersion of the projective plane. 


C have collapsed onto diagram V, diagrams D, E, and I have become diagrams 
W, X, and Z, respectively, and diagrams F, G, and H have collapsed onto 
diagram Y. The manifold whose immersion Fig. 7 represents is nonorientable 
and is, in fact, the projective plane P². The “crossover” configuration 
evidently yields a double covering of P² by S². It was the discovery that this 
double covering can be realized by a differentiable homotopy of immersions 
which led to the recognition that the 2-sphere can be everted. 
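
The two-to-one character of a covering of the projective plane by the 2-sphere can be illustrated numerically. What follows is only a minimal sketch and is not the immersion of Fig. 7. It assumes the standard model in which the projective plane is the unit sphere with antipodal points identified, and it uses the quadratic map (x, y, z) → (x², y², z², yz, zx, xy), chosen here purely for illustration. All function names are hypothetical.

```python
import math
import random


def quadratic_map(p):
    """Send a point of the unit sphere to the six quadratic monomials.

    Every component is even in p, so p and -p have the same image.
    """
    x, y, z = p
    return (x * x, y * y, z * z, y * z, z * x, x * y)


def antipode(p):
    """Return the antipodal point -p."""
    return tuple(-c for c in p)


def random_point_on_sphere(rng):
    """Draw a point on the unit sphere by normalizing a Gaussian vector."""
    while True:
        v = tuple(rng.gauss(0.0, 1.0) for _ in range(3))
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-9:
            return tuple(c / norm for c in v)


def distance(a, b):
    """Euclidean distance between two tuples of equal length."""
    return math.sqrt(sum((s - t) ** 2 for s, t in zip(a, b)))


if __name__ == "__main__":
    rng = random.Random(0)
    p = random_point_on_sphere(rng)
    q = random_point_on_sphere(rng)
    # Antipodal points are sent to the same image point.
    print(distance(quadratic_map(p), quadratic_map(antipode(p))))
    # Two unrelated points are, in general, sent to different image points.
    print(distance(quadratic_map(p), quadratic_map(q)))
```

Antipodal points yield the same six numbers, while points that are not antipodal yield different ones. In this sense the sphere covers the image surface exactly twice, which is the situation that arises at the moment of crossover.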

By attaching tiny handles to the initial 2-sphere and carrying out the 
same steps as before it is not difficult to show that any compact orientable 
2-manifold can be turned inside out via a differentiable homotopy of 
immersions in Euclidean 3-space. It is perhaps well to end with the remark that 
this result in no way contradicts the theorem which states that the embedding⁷ 


7 An embedding is an injective map. An immersion with no self-intersections is an 
embedding. 




of any compact orientable 2-manifold in Euclidean 3-space divides the latter 
into two disjoint regions, the inside and the outside, which are easily distin- 
guished by their compactness and lack of compactness, respectively. When 
we evert such a manifold we do not simultaneously interchange these two 
regions. No homotopy can accomplish that. 
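
The closing remark can be checked in the simplest case. The sketch below assumes that the everted round sphere is described by the antipodal parametrization f = −x of the unit sphere. This is an assumption of the illustration, and the intermediate stages of the homotopy are not modeled. All function names are hypothetical. The code compares the normal f_θ × f_φ determined by each parametrization with the position vector of the corresponding point.

```python
import math


def sphere(theta, phi):
    """Standard parametrization x(theta, phi) of the unit 2-sphere."""
    s = math.sin(theta)
    return (s * math.cos(phi), s * math.sin(phi), math.cos(theta))


def everted_sphere(theta, phi):
    """Antipodal parametrization -x(theta, phi): same image, other side out."""
    x, y, z = sphere(theta, phi)
    return (-x, -y, -z)


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(s * t for s, t in zip(a, b))


def parametrization_normal(f, theta, phi, h=1e-6):
    """Approximate f_theta x f_phi using central differences of step h."""
    f_theta = tuple((p - m) / (2 * h)
                    for p, m in zip(f(theta + h, phi), f(theta - h, phi)))
    f_phi = tuple((p - m) / (2 * h)
                  for p, m in zip(f(theta, phi + h), f(theta, phi - h)))
    return cross(f_theta, f_phi)


def outward_component(f, theta, phi):
    """Component of the parametrization normal along the position vector."""
    return dot(parametrization_normal(f, theta, phi), f(theta, phi))


if __name__ == "__main__":
    theta, phi = 1.0, 0.5
    # Positive: the normal of the standard sphere points away from the origin.
    print(outward_component(sphere, theta, phi))
    # Negative: the normal of the antipodal parametrization points toward it.
    print(outward_component(everted_sphere, theta, phi))
```

Both parametrizations have the same image, so the compact region enclosed is the same ball in each case. Only the side of the surface that faces this region has changed.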


